Approximation by Ridge Functions
and  Neural  Networks  * 

Pencho  P.  Petrushev 


Abstract 

We investigate the efficiency of approximation by linear combinations of ridge functions in the metric of $L_2(B^d)$ with $B^d$ the unit ball in $\mathbb{R}^d$. If $X_n$ is an $n$-dimensional linear space of univariate functions in $L_2(I)$, $I = [-1, 1]$, and $\Omega$ is a subset of the unit sphere $S^{d-1}$ in $\mathbb{R}^d$ of cardinality $m$, then the space $Y_n := \operatorname{span}\{r(x \cdot \omega) : r \in X_n,\ \omega \in \Omega\}$ is a linear space of ridge functions of dimension $\le mn$. We show that if $X_n$ provides order of approximation $O(n^{-r})$ for univariate functions with $r$ derivatives in $L_2(I)$ and the sets $\Omega$ are properly chosen with cardinality $O(n^{d-1})$, then $Y_n$ will provide approximation of order $O(n^{-r-d/2+1/2})$ for every function $f \in L_2(B^d)$ with smoothness of order $r + d/2 - 1/2$ in $L_2(B^d)$. Thus, the theorems we obtain show that this form of ridge approximation has the same efficiency of approximation as other more traditional methods of multivariate approximation such as polynomials, splines, or wavelets. The theorems we obtain can be applied to show that a feed-forward neural network with one hidden layer of computational nodes given by a certain sigmoidal function $\sigma$ will also have this approximation efficiency. Minimal requirements are made of the sigmoidal functions and in particular our results hold for the unit-impulse function $\sigma = \chi_{[0,\infty)}$.

Keywords and phrases: approximation error, ridge functions, neural networks.

AMS  classification:  41A15,  41A25,  41A29 
Abbreviated  title:  Approximation  by  Ridge  Functions 


1  Introduction 

A ridge function is a multivariate function of the form $r(x \cdot \omega)$, where $r$ is a univariate function, $\omega$ is a fixed vector in $\mathbb{R}^d$, the variable $x \in \mathbb{R}^d$, and $x \cdot \omega$ is the inner product of $x$ and $\omega$. These functions appear naturally in harmonic analysis, special function theory, and in several applications such as tomography and neural networks. In most applications, we are interested in representing or approximating a general function $f$ on a domain $\Omega \subset \mathbb{R}^d$ by linear combinations of ridge functions. It is surprising therefore that the most fundamental questions concerning the efficiency of approximation by ridge functions are unanswered.
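As a small illustration of the definition above, the following sketch builds a ridge function from a univariate profile and a direction and checks that it is constant along directions orthogonal to $\omega$. The helper name `ridge`, the profile `tanh`, and the sample points are illustrative assumptions, not taken from the paper.

```python
import math

def ridge(r, omega):
    """Return the ridge function x -> r(x . omega) for a fixed direction omega."""
    def f(x):
        return r(sum(xi * wi for xi, wi in zip(x, omega)))
    return f

# Hypothetical example in R^3: profile r = tanh, direction omega on the unit sphere.
omega = (1 / 3, 2 / 3, 2 / 3)
f = ridge(math.tanh, omega)

x = (0.2, -0.1, 0.4)
v = (2.0, -1.0, 0.0)  # v . omega = 0, so v lies in the hyperplane orthogonal to omega
x_shifted = tuple(xi + 0.7 * vi for xi, vi in zip(x, v))

# A ridge function does not change when x moves orthogonally to omega.
print(f(x), f(x_shifted))
```

The two printed values agree up to rounding, which is the sense in which a ridge function varies in one direction only.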

In this paper, we shall consider approximating functions in $L_2(B^d)$, $B^d$ the unit ball in $\mathbb{R}^d$, $d > 2$, by linear combinations of ridge functions. Using extension theorems, the set $B^d$ can be replaced by more general sets $\Omega \subset \mathbb{R}^d$.

Let $X_n$ be a linear space of univariate functions in $L_2(I)$, $I := [-1, 1]$, and let $\Omega_n \subset S^{d-1}$ be a finite subset of the unit sphere $S^{d-1}$ in $\mathbb{R}^d$. Then,

$$Y_n := \operatorname{span}\{r(x \cdot \omega) : r \in X_n,\ \omega \in \Omega_n\} \tag{1.1}$$

is a space of multivariate ridge functions of dimension $\le n\,\#\Omega_n$, where $\#\Omega_n$ is the cardinality of $\Omega_n$. We shall relate the approximation efficiency of $Y_n$ to that of $X_n$ and the distribution of the vectors of $\Omega_n$ in $S^{d-1}$.

Let $W^s(L_2(I))$ denote the univariate Sobolev spaces. We say that a sequence of spaces $X_n$, $n = 1, 2, \dots$, $\dim(X_n) = n$, provides approximation of order $s$ if

$$E(g, X_n)_{L_2(I)} \le c(s)\, n^{-s}\, \|g\|_{W^s(L_2(I))}, \qquad g \in W^s(L_2(I)), \tag{1.2}$$

where

$$E(g, X_n)_{L_2(I)} := \inf_{r \in X_n} \|g - r\|_{L_2(I)}$$

is the error in approximating the univariate function $g$ in the $L_2(I)$ norm by the elements of $X_n$. We denote similarly the multivariate Sobolev space $W^s(L_2(B^d))$ on $B^d$ and the approximation error

$$E(f, Y_n)_{L_2(B^d)} := \inf_{R \in Y_n} \|f - R\|_{L_2(B^d)}$$

for any $f \in L_2(B^d)$. Our main result, given in §8, shows that for any sequence of spaces $X_n$, $n = 1, 2, \dots$, which provide approximation of order $s$, and for appropriately chosen sets $\Omega_n$ with $\#\Omega_n = O(n^{d-1})$, the sequence of spaces $Y_n$, $n = 1, 2, \dots$, given in (1.1), provides the following approximation: for $\lambda := s + (d-1)/2$,

$$E(f, Y_n)_{L_2(B^d)} \le c(\lambda)\, n^{-\lambda}\, \|f\|_{W^\lambda(L_2(B^d))}, \qquad f \in W^\lambda(L_2(B^d)). \tag{1.3}$$

Note that there is, in a certain sense, an unexpected gain in the multivariate approximation order $s + (d-1)/2$ over the univariate order $s$. This gain will be explained later (see §9).

One can generate the space $Y_n$ appearing in (1.3) by using very general univariate spaces $X_n$ such as splines or wavelets. In particular, our results apply to feed-forward neural networks using a very general activation function $\sigma$. A complete discussion of the application to neural networks is given in §9. In this introduction, we wish to illustrate the typical result by considering the following simple example. Let
$\sigma = \chi_{[0,\infty)}$ and define $X_n$ as the univariate space spanned by $\sigma(x - k/n)$, $0 \le k \le n$.
Then, defining $Y_n$ for this $X_n$ as described above, we obtain a space of dimension $O(n^d)$ of certain piecewise constant functions. The space $Y_n$ can be realized computationally by a feed-forward neural network with $O(n^{d-1})$ computational nodes. In this case (see §9 for details), (1.3) provides the approximation order $1 + (d-1)/2$. One might expect the order in the estimate (1.3) to be 1 since we are using piecewise constants in the approximation. As noted after (1.3), the gain of $(d-1)/2$ in the approximation rate persists in general (see also Theorem 8.2).
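The univariate order $s = 1$ for piecewise constants can be observed numerically. The sketch below assumes, for illustration only, that the univariate space amounts to piecewise constants on a uniform partition of $[-1, 1]$ into $n$ cells; the function name `pc_error` and the sampling parameter `sub` are hypothetical choices.

```python
import math

def pc_error(g, n, sub=200):
    """L2([-1,1]) error of the best piecewise-constant approximation of g on n equal cells.

    The best constant on each cell is the cell mean; integrals are computed with
    a midpoint rule using `sub` sample points per cell.
    """
    h = 2.0 / n
    total = 0.0
    for k in range(n):
        a = -1.0 + k * h
        pts = [a + (j + 0.5) * h / sub for j in range(sub)]
        vals = [g(t) for t in pts]
        mean = sum(vals) / sub
        total += sum((v - mean) ** 2 for v in vals) * h / sub
    return math.sqrt(total)

for n in (4, 8, 16, 32):
    # error * n stays roughly constant, i.e. the error behaves like c / n
    print(n, pc_error(math.sin, n) * n)
```

Doubling $n$ roughly halves the error, which is the univariate rate that (1.3) improves by $(d-1)/2$ in the multivariate setting.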

There is a standard method in approximation theory (see [DL, Chapter 7]) which derives from (1.3) the estimate

$$E(f, Y_n)_{L_2(B^d)} \le c\left(\omega_r(f, n^{-1})_{L_2(B^d)} + \|f\|_{L_2(B^d)}\, n^{-r}\right), \qquad f \in L_2(B^d), \tag{1.4}$$

with $\omega_r$ the $r$-th order modulus of smoothness of $f$. In the case that $Y_n$ contains all polynomials of total degree $< r$ (in $d$ variables), the last term on the right can be eliminated.

Since $Y_n$ is a linear space of dimension $O(n^d)$, it follows from the general theory of $n$-widths that for all $m > 0$,

$$\sup_{\|f\|_{W^m(L_2(B^d))} \le 1} E(f, Y_n) \ge c_0\, n^{-m} \tag{1.5}$$

with $c_0 > 0$ a constant depending only on $m$ and $d$. In this sense, the estimates (1.3) cannot be improved.

We also note that (1.3) shows that, in general, linear spaces of ridge functions are at least as efficient as other methods of multivariate approximation such as polynomials, wavelets, and splines.

This  paper  is  an  extension  of  the  results  from  [DOP],  where  we  considered  the 
case  d  =  2.  Throughout  the  paper  we  assume  that  d  >  2,  although  most  of  the 
statements  hold  when  d  =  2. 

The  results  of  this  paper  differ  from  other  work  in  this  field  in  the  following 
respects.  We  are  able  to  begin  with  a  very  general  class  of  univariate  spaces  Xn. 
Other  authors  (most  notably  Micchelli  and  Mhaskar  [MM],  [MM1]  and  Mhaskar  [M]) 
have  also  considered  approximation  problems  of  the  type  treated  here.  The  work  of 
Micchelli  and  Mhaskar  does  not  give  the  best  order  of  approximation.  Mhaskar  [M] 
has  given  best  possible  results  but  only  in  the  case  that  Xn  is  generated  using  a  rather 
restrictive  class  of  sigmoidal  functions. 

Our results are, for the present, limited to approximation in $L_2$, and it remains an important open question in ridge approximation to understand to what extent results such as those presented in this paper are valid in $L_p$, $p \ne 2$.

It is also an interesting question to understand which sets $\Omega_n \subset S^{d-1}$, when used in defining the spaces $Y_n$, will provide the approximation order of (1.3). In the case $d = 2$, as was shown in [DOP], $n$ equally spaced points on $S^1$ are the most natural choice. There is no direct analogy of equally spaced points in $S^{d-1}$, $d > 2$. It will become clear from §4 that any set $\Omega_n$ which permits a cubature formula that is exact for spherical polynomials of degree $\le n$ and with good localization properties will provide spaces $Y_n$ which satisfy (1.3). Since we could not find in the literature examples of such sets $\Omega_n$, we construct some in §4. There should be more elegant and more natural constructions than ours. In some sense, one might expect that a natural quadrature formula might provide the analogue of equally spaced points in $S^{d-1}$, $d > 2$.

We prove (1.3) by first understanding well the structure of ridge polynomials. Our main vehicle (given in §3) is a fundamental orthogonal decomposition of a general function $f \in L_2(B^d)$ into ridge polynomials. This decomposition uses the univariate Gegenbauer polynomials.

An outline of this paper is the following. The properties we need about Gegenbauer polynomials are given in §2. In §3, we give the fundamental orthogonal decomposition of functions in $L_2(B^d)$ in terms of ridge polynomials. In §4, we give our construction of cubature (quadrature) formulas. In §§5-6, we introduce smoothness spaces (the Sobolev spaces) and recall their characterization by polynomial approximation. In §7, we prove the main theorem about approximation by ridge functions. In §8, we discuss how to improve the theorem of §7 to be more amenable to applications. In §9, we give some applications of our results, in particular to feed-forward neural networks.

Throughout the paper, the constants are denoted by $c, c_1, \dots$ and they may vary at every occurrence. The constants usually depend on some parameters (like the dimension $d$) that will sometimes be indicated explicitly.


2  The  Gegenbauer  (ultraspherical)  polynomials 

Special  functions  appear  naturally  when  we  represent  a  general  function  in  terms  of 
ridge  polynomials  as  will  be  done  in  the  next  section.  In  particular,  the  Gegenbauer 
polynomials  will  play  an  important  role  in  this  paper.  In  this  section,  we  shall  present 
the  essential  properties  of  Gegenbauer  polynomials  and  bring  out  their  role  in  the 
Radon  transform.  We  refer  the  reader  to  [E]  and  [Sz]  as  general  references  for  this 
section. 

The Gegenbauer polynomials are usually defined by the following generating function

$$(1 - 2tz + z^2)^{-\lambda} = \sum_{m=0}^{\infty} C_m^\lambda(t)\, z^m,$$

where $|z| < 1$, $|t| \le 1$, and $\lambda > 0$. The coefficients $C_m^\lambda(t)$ are algebraic polynomials of degree $m$ which are called the Gegenbauer polynomials associated with $\lambda$. The family of polynomials $\{C_m^\lambda\}_{m=0}^\infty$ is a complete orthogonal system for the weighted space $L_2(I, w)$, $I := [-1, 1]$, $w(t) := w_\lambda(t) := (1 - t^2)^{\lambda - 1/2}$, and we have

$$\int_I C_m^\lambda(t)\, C_n^\lambda(t)\, w(t)\, dt = \begin{cases} 0, & m \ne n, \\ h_{n,\lambda}, & m = n, \end{cases} \qquad\text{with}\qquad h_{n,\lambda} := \frac{\pi^{1/2}\,(2\lambda)_n\,\Gamma(\lambda + 1/2)}{(n + \lambda)\, n!\, \Gamma(\lambda)}, \tag{2.1}$$

where we use here and later the standard notation

$$(a)_0 := 1, \qquad (a)_n := a(a+1)\cdots(a+n-1) = \Gamma(a+n)/\Gamma(a).$$

Also, we have

$$C_n^\lambda(-t) = (-1)^n C_n^\lambda(t), \qquad C_n^\lambda(1) = \frac{(2\lambda)_n}{n!}, \qquad\text{and}\qquad C_0^\lambda(t) = 1. \tag{2.2}$$

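The Gegenbauer polynomials can also be defined by the following identity (called Rodrigues' formula):

$$C_n^\lambda(t) = (-1)^n a_{n,\lambda}\,(1 - t^2)^{-\lambda + 1/2} \left(\frac{d}{dt}\right)^{n} (1 - t^2)^{n + \lambda - 1/2}, \qquad a_{n,\lambda} := \frac{(2\lambda)_n}{n!\, 2^n (\lambda + \tfrac12)_n}. \tag{2.3}$$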


There is an identity that relates Gegenbauer polynomials with different weights:

$$\left(\frac{d}{dt}\right)^{m} C_n^\lambda(t) = 2^m (\lambda)_m\, C_{n-m}^{\lambda+m}(t), \qquad m = 1, 2, \dots, n. \tag{2.4}$$
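Identity (2.4) can be verified exactly with rational arithmetic on polynomial coefficients. This is a minimal sketch: the coefficient lists are produced by the standard three-term recurrence for $C_n^\lambda$ (an assumption, since the text defines the polynomials through the generating function), and all helper names are hypothetical.

```python
from fractions import Fraction

def gegenbauer_coeffs(n, lam):
    """Coefficients [a_0, a_1, ...] of C_n^lam(t), built from the three-term recurrence."""
    lam = Fraction(lam)
    prev, cur = [Fraction(1)], [Fraction(0), 2 * lam]  # C_0 = 1, C_1 = 2 lam t
    if n == 0:
        return prev
    for k in range(2, n + 1):
        shifted = [Fraction(0)] + cur  # coefficients of t * C_{k-1}
        padded = prev + [Fraction(0)] * (len(shifted) - len(prev))
        nxt = [(2 * (k + lam - 1) * s - (k + 2 * lam - 2) * p) / k
               for s, p in zip(shifted, padded)]
        prev, cur = cur, nxt
    return cur

def derivative(coeffs, m=1):
    """m-th derivative of a polynomial given by its coefficient list."""
    for _ in range(m):
        coeffs = [j * c for j, c in enumerate(coeffs)][1:] or [Fraction(0)]
    return coeffs

def pochhammer(a, n):
    """(a)_n with (a)_0 = 1, in exact arithmetic."""
    out = Fraction(1)
    for j in range(n):
        out *= a + j
    return out

n, m, lam = 7, 3, Fraction(3, 2)
lhs = derivative(gegenbauer_coeffs(n, lam), m)
rhs = [2 ** m * pochhammer(lam, m) * c for c in gegenbauer_coeffs(n - m, lam + m)]
print(lhs == rhs)
```

Because the arithmetic is exact, the comparison tests the identity itself rather than a floating-point approximation of it.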


Special cases of the Gegenbauer polynomials are the Legendre polynomials $P_n$ and the Chebyshev polynomials of the second kind $U_n$, which correspond to $\lambda = 1/2$ and $\lambda = 1$, respectively. Namely,

$$P_n(t) := \frac{(-1)^n}{2^n\, n!} \left(\frac{d}{dt}\right)^{n} (1 - t^2)^n = C_n^{1/2}(t),$$

$$U_n(t) := \frac{\sin\big((n+1)\arccos t\big)}{\sqrt{1 - t^2}} = C_n^{1}(t).$$

The Chebyshev polynomials of the first kind $T_n(t) := \cos(n \arccos t)$ can be considered as the Gegenbauer polynomials $C_n^0$ associated with the weight $w_0(t) = (1 - t^2)^{-1/2}$.

We shall also need the Gegenbauer polynomials $C_n^\lambda$ when $\lambda < 0$ and, in particular, when $\lambda = -1, -2, \dots$ Note that $a_{n,\lambda} = 0$ when $\lambda = -\nu$, $\nu = 1, 2, \dots$, and $n > 2\nu$. Therefore, we cannot use (2.3) to define $C_n^{-\nu}$ when $\nu = 1, 2, \dots$ However, we can define (see [Sz, Chapter IV])

$$C_n^\lambda(t) := a\,(1 - t^2)^{-\lambda + 1/2} \left(\frac{d}{dt}\right)^{n} (1 - t^2)^{n + \lambda - 1/2}, \qquad \lambda < 0, \tag{2.5}$$

where $a$ is any constant independent of $t$. For our purposes the normalization of $C_n^\lambda$ ($\lambda < 0$) is not essential. Identity (2.4) remains valid except for a constant factor (see [Sz, Chapter IV]): for any $\lambda$, we have

$$\left(\frac{d}{dt}\right)^{m} C_n^\lambda(t) = c\, C_{n-m}^{\lambda+m}(t), \qquad m = 1, 2, \dots, n, \tag{2.6}$$

where $c$ is independent of $t$.

The Gegenbauer polynomials play a fundamental role in inverting the Radon transform. We shall show in Lemma 2.1 below that the Gegenbauer polynomials $C_n^\lambda$ for $\lambda = k$ and $\lambda = k + 1/2$ ($k$ an integer) are eigenfunctions for certain differential operators that occur in the Radon transform inversion formula. These operators will play an important role in defining an equivalent norm for the weighted Sobolev spaces $W^s(L_2(I, w))$ (see §8).

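We begin with a brief discussion of the Hilbert transform $H$ on $\mathbb{R}$ and its analogue $\mathbf{H}$ for the interval $I := [-1, 1]$. For any $g \in L_1(I)$ we define

$$\mathbf{H}g := Hg^0 \qquad\text{with}\qquad g^0(t) := \begin{cases} g(t), & t \in I, \\ 0, & t \in (-\infty, \infty) \setminus I, \end{cases} \tag{2.7}$$

where $Hg^0$ is the Hilbert transform of $g^0$. It follows that

$$\mathbf{H}g(t) = \frac{1}{\pi}\,\mathrm{p.v.}\int_{\mathbb{R}} \frac{g^0(s)}{t - s}\, ds = \frac{1}{\pi}\,\mathrm{p.v.}\int_{I} \frac{g(s)}{t - s}\, ds.$$

The analogue of the Hilbert transform on the circle $\mathbb{T}$ is the conjugate operator (see [Z, Chapter II]). If $g \in L_1(\mathbb{T})$, we denote its conjugate function by

$$\tilde g(\tau) := \frac{1}{2\pi}\,\mathrm{p.v.}\int_{\mathbb{T}} g(\theta) \cot\frac{\tau - \theta}{2}\, d\theta.$$

For any (nonnegative) weight function $w$, let $L_2^0(I, w)$ be the space of all $g \in L_2(I, w)$ with weighted mean value zero: $\int_I g(t)\, w(t)\, dt = 0$. The following proposition gives some properties of $\mathbf{H}$ which we shall use.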


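Proposition 2.1 If $g \in L_1(I)$, we define $Tg(\theta) := \operatorname{sgn}\theta\; g(\cos\theta)\,\sin\theta$ for $\theta \in [-\pi, \pi)$.

The Hilbert transform $\mathbf{H}$ satisfies:

(a) If $g \in L_1(I)$ then

$$\mathbf{H}g(\cos\tau) = -\frac{1}{\sin\tau}\, \widetilde{Tg}(\tau) \qquad\text{a.e. on } (0, \pi). \tag{2.8}$$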


(b) We have, on $(-1, 1)$, $\mathbf{H}[w_1^{-1}] = 0$,

$$\mathbf{H}[w_1^{-1} T_{n+1}] = -U_n \qquad\text{and}\qquad \mathbf{H}[w_1 U_n] = T_{n+1} \qquad\text{for } n = 0, 1, \dots, \tag{2.9}$$


and hence

$$\frac{d}{dt}\, \mathbf{H}[w_1 U_n] = (n+1)\, U_n. \tag{2.10}$$

(c) The functions $V_n := w_1^{-1} T_{n+1}$, $n = 0, 1, \dots$ (in analogy to $\{U_n\}_{n=0}^\infty$) form a complete orthogonal system for $L_2(I, w_1)$.
(d) $\mathbf{H}$ is a one-to-one mapping of $L_2^0(I, w_1)$ onto $L_2(I, w_1)$ with

$$\mathbf{H}^{-1} h = -\frac{1}{w_1}\, \mathbf{H}(w_1 h) \qquad\text{for } h \in L_2(I, w_1)$$

and

$$\|\mathbf{H}g\|_{L_2(I, w_1)} = \|g\|_{L_2(I, w_1)} \qquad\text{for } g \in L_2^0(I, w_1). \tag{2.11}$$

(e) The operators $\mathbf{H}$ and $\frac{d}{dt}$ commute: for any polynomial $P$, we have

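Proof. (a) We apply the substitution $s = \cos\theta$ to the integral that defines $\mathbf{H}g$, replace $t$ by $\cos\tau$, $0 < \tau < \pi$, and obtain

$$\mathbf{H}g(\cos\tau) = \frac{1}{\pi}\,\mathrm{p.v.}\int_0^{\pi} \frac{Tg(\theta)}{\cos\tau - \cos\theta}\, d\theta = \frac{1}{2\pi}\,\mathrm{p.v.}\int_{-\pi}^{\pi} \frac{Tg(\theta)}{\cos\tau - \cos\theta}\, d\theta,$$

since the integrand is even. Note that $\mathrm{p.v.}\int \dots\, ds = \mathrm{p.v.}\int \dots\, d\theta$ above since the substituting function and its inverse are smooth enough. Now, we use the identity

$$\frac{1}{\cos\tau - \cos\theta} = -\frac{1}{2\sin\tau}\left[\cot\frac{\tau - \theta}{2} + \cot\frac{\tau + \theta}{2}\right]$$

to obtain

$$\mathbf{H}g(\cos\tau) = -\frac{1}{2\sin\tau}\left[\frac{1}{2\pi}\,\mathrm{p.v.}\int_{-\pi}^{\pi} Tg(\theta) \cot\frac{\tau - \theta}{2}\, d\theta + \frac{1}{2\pi}\,\mathrm{p.v.}\int_{-\pi}^{\pi} Tg(\theta) \cot\frac{\tau + \theta}{2}\, d\theta\right].$$

After substituting $\theta = -\theta'$ in the second integral above and using that $Tg$ is even, we see that the two integrals are equal and therefore, we obtain (a).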

(b) For any function $g \in L_1(I, w_1^{-1})$, we have $T[w_1^{-1} g](\theta) = g(\cos\theta)$. Since the conjugate function of $\cos n\theta$ is $\sin n\theta$, $n = 0, 1, \dots$, the first two statements in (b) follow from (a). Similar calculations give the last two statements.

(c) This is trivial.

(d) This follows from (b) by using the two bases for $L_2(I, w_1)$ given in (c).

(e)  This  follows  from  (2.10).  □ 
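The trigonometric identity that drives part (a) of the proof, written here in the form $\frac{1}{\cos\tau - \cos\theta} = -\frac{1}{2\sin\tau}\big[\cot\frac{\tau-\theta}{2} + \cot\frac{\tau+\theta}{2}\big]$, is easy to spot-check numerically. The sample angles below are arbitrary illustrative choices.

```python
import math

def lhs(tau, theta):
    """Kernel that appears after the substitution s = cos(theta), t = cos(tau)."""
    return 1.0 / (math.cos(tau) - math.cos(theta))

def rhs(tau, theta):
    """Sum of two cotangent (conjugate-function) kernels, with the sign made explicit."""
    cot = lambda u: math.cos(u) / math.sin(u)
    return -(cot((tau - theta) / 2) + cot((tau + theta) / 2)) / (2 * math.sin(tau))

for tau, theta in [(0.7, 2.1), (1.3, -0.4), (2.5, 0.9)]:
    print(lhs(tau, theta), rhs(tau, theta))
```

The check makes the sign convention concrete: with the kernel $1/(t - s)$ and the conjugate function normalized so that $\cos n\theta \mapsto \sin n\theta$, the minus sign is what produces $\mathbf{H}[w_1^{-1} T_{n+1}] = -U_n$ in (2.9).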

We shall next show that the Gegenbauer polynomials are eigenfunctions of certain differential operators that arise in inverting the Radon transform. For functions $g$ defined on $B^d$, we introduce the following differential operators:

$$\Lambda g := \left(\frac{d}{dt}\right)^{d-1} \left[w_{d/2}\, g\right] \tag{2.12}$$

and

$$\mathcal{D} := \begin{cases} \Lambda, & d \text{ odd}, \\ \mathbf{H}\Lambda, & d \text{ even}. \end{cases} \tag{2.13}$$
Lemma 2.1 Let $d > 2$ and define $U_n := C_n^{d/2}$ for $n = 0, 1, \dots$. Then, we have

$$\mathcal{D} U_n = (-1)^{[(d-1)/2]}\, \mu_n U_n, \qquad n = 0, 1, \dots, \tag{2.14}$$

and

$$\Lambda^2 U_n = (-1)^{d-1}\, \mu_n^2\, U_n, \qquad n = 0, 1, \dots, \tag{2.15}$$

where

$$\mu_n = (n+1)_{d-1} \asymp n^{d-1}, \qquad n = 0, 1, \dots \tag{2.16}$$

Proof. We first consider (2.14) in the case when $d$ is odd, $d = 2k + 1$. From (2.3) and (2.4), we find


By examining the coefficients of $t^n$ we obtain that $c_2 = (-1)^k \mu_n$. Thus (2.14) is proved in this case.

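Assume now that $d$ is even, $d =: 2k$. Then, again using (2.3), (2.4), and (2.10) (recall that $C_n^1 = U_n$, the Chebyshev polynomial of the second kind) and the commutativity of $\frac{d}{dt}$ and $\mathbf{H}$, we obtain


We can calculate $c_3$ as follows. Let $C_n^k(t) =: c_n t^n + \dots$ and, for the Chebyshev polynomial of the second kind, $U_r(t) =: a_r t^r + \dots$ with $r := n + 2k - 2$. We find

$$\begin{aligned}
\left(\frac{d}{dt}\right)^{2k-1} \mathbf{H}\left[w_k\, C_n^k(t)\right]
&= c_n \left(\frac{d}{dt}\right)^{2k-1} \mathbf{H}\left[w_1(t)\left((-1)^{k-1} t^{n+2k-2} + \dots\right)\right] \\
&= (-1)^{k-1}\, \frac{c_n}{a_r} \left(\frac{d}{dt}\right)^{2k-2} \frac{d}{dt}\, \mathbf{H}\left[w_1(t)\, U_{n+2k-2}(t) + \dots\right] \\
&= (-1)^{k-1}\, \frac{c_n}{a_r}\,(n + 2k - 1) \left(\frac{d}{dt}\right)^{2k-2} \left[U_{n+2k-2}(t) + \dots\right] \\
&= (-1)^{k-1}\, c_n\,(n + 2k - 1) \left(\frac{d}{dt}\right)^{2k-2} \left(t^{n+2k-2} + \dots\right) \\
&= (-1)^{k-1}\, (n+1)_{2k-1}\, c_n \left(t^n + \dots\right) = (-1)^{k-1}\, (n+1)_{2k-1}\, C_n^k(t),
\end{aligned}$$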
where we used identities (2.3), (2.4), and (2.10). Thus (2.14) is proved in this case as well.

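Finally, we consider (2.15). From (2.3) and (2.5), respectively, we have

$$\begin{aligned}
\Lambda U_n = \Lambda C_n^{d/2}
&= c \left(\frac{d}{dt}\right)^{n+d-1} \left[(1 - t^2)^{n + d/2 - 1/2}\right] \\
&= c \left(\frac{d}{dt}\right)^{n+d-1} \left[(1 - t^2)^{n+d-1}\,(1 - t^2)^{-d/2 + 1/2}\right] \\
&= c_1\, (1 - t^2)^{-d/2 + 1/2}\, C_{n+d-1}^{-d/2+1}(t).
\end{aligned}$$

Hence, applying $\Lambda$ once again and using (2.6) gives

$$\Lambda^2 U_n = \Lambda^2 C_n^{d/2} = c_1 \left(\frac{d}{dt}\right)^{d-1} C_{n+d-1}^{-d/2+1} = c_2\, C_n^{d/2} = c_2\, U_n.$$

By calculating the coefficients of $t^n$, we find $c_2 = (-1)^{d-1} \mu_n^2$ and we arrive at (2.15).

□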
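For odd $d$ the weight $w_{d/2}(t) = (1 - t^2)^{(d-1)/2}$ is a polynomial, so the eigenvalue relation (2.14) with $\mathcal{D} = \Lambda$ can be confirmed exactly. The sketch below does this with rational coefficient arithmetic; it assumes the standard three-term recurrence for $C_n^\lambda$, and every helper name is hypothetical.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_diff(p, m):
    """m-th derivative of a polynomial given as a coefficient list."""
    for _ in range(m):
        p = [j * c for j, c in enumerate(p)][1:] or [Fraction(0)]
    return p

def gegenbauer_coeffs(n, lam):
    """Coefficients of C_n^lam(t) from the three-term recurrence."""
    lam = Fraction(lam)
    prev, cur = [Fraction(1)], [Fraction(0), 2 * lam]
    if n == 0:
        return prev
    for k in range(2, n + 1):
        shifted = [Fraction(0)] + cur
        padded = prev + [Fraction(0)] * (len(shifted) - len(prev))
        nxt = [(2 * (k + lam - 1) * s - (k + 2 * lam - 2) * p) / k
               for s, p in zip(shifted, padded)]
        prev, cur = cur, nxt
    return cur

def apply_Lambda(p, d):
    """Lambda g = (d/dt)^(d-1) [ (1 - t^2)^((d-1)/2) g ], valid here for odd d only."""
    w = [Fraction(1)]
    for _ in range((d - 1) // 2):
        w = poly_mul(w, [Fraction(1), Fraction(0), Fraction(-1)])
    return poly_diff(poly_mul(w, p), d - 1)

def mu(n, d):
    """mu_n = (n+1)_{d-1} as in (2.16)."""
    out = 1
    for j in range(d - 1):
        out *= n + 1 + j
    return out

d, n = 5, 4
U = gegenbauer_coeffs(n, Fraction(d, 2))
k = (d - 1) // 2
print(apply_Lambda(U, d) == [(-1) ** k * mu(n, d) * c for c in U])
```

The even-dimensional case involves the Hilbert transform and is not covered by this polynomial-only check.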


3  An orthogonal decomposition of $L_2(B^d)$ in terms of ridge polynomials


Since we are interested in approximating functions $f \in L_2(B^d)$ by elements from spaces of ridge functions, it is natural to find a decomposition of $f$ in terms of fundamental building blocks of ridge functions. We shall show in this section that we can take as the building blocks certain ridge polynomials. We begin by describing this decomposition.

If $f, g \in L_2(B^d)$, we define the inner product

$$(f, g) := \int_{B^d} f(x)\, g(x)\, dx. \tag{3.1}$$

This inner product induces the norm

$$\|f\|_{L_2(B^d)} := \left(\int_{B^d} |f(x)|^2\, dx\right)^{1/2}.$$

We also define, for $f, g \in L_2(S^{d-1})$, the inner product

$$\langle f, g \rangle := \int_{S^{d-1}} f(\xi)\, g(\xi)\, d\xi \tag{3.2}$$

and the norm

$$\|f\|_{L_2(S^{d-1})} := \left(\int_{S^{d-1}} |f(\xi)|^2\, d\xi\right)^{1/2},$$

where $d\xi$ stands for the area (volume) element on $S^{d-1}$, the unit sphere in $\mathbb{R}^d$.

The Gegenbauer polynomials $C_n^{d/2}$ are the building blocks for our decomposition. Let

$$U_n := (h_{n,d/2})^{-1/2}\, C_n^{d/2}, \qquad n = 0, 1, \dots, \tag{3.3}$$

where $h_{n,d/2}$ is from (2.1). Then $\|U_n\|_{L_2(I, w)} = 1$ and hence $\{U_n\}_{n=0}^\infty$ is a complete orthonormal system for the weighted space $L_2(I, w)$, $w(t) := w_{d/2}(t) = (1 - t^2)^{(d-1)/2}$. Of course, $U_n$ depends on the space dimension $d$ but we are suppressing this dependence in our notation. The reader should think of the space dimension $d$ as arbitrary but fixed throughout.

Let $\mathcal{P}_n$ denote the set of all algebraic polynomials of total degree $n$ in $d$ real variables. That is, each $P \in \mathcal{P}_n$ is a linear combination of monomials $x^m := x_1^{m_1} \cdots x_d^{m_d}$ with $x := (x_1, \dots, x_d)$, where $m$ is a $d$-tuple $(m_1, \dots, m_d)$ of nonnegative integers and $|m| := m_1 + \dots + m_d \le n$.

The polynomials $U_n(\xi \cdot x)$, $\xi \in S^{d-1}$, are in $\mathcal{P}_n$ and $U_n(\xi \cdot x)$ are orthogonal to $\mathcal{P}_{n-1}$ in $L_2(B^d)$ (proved in the appendix):
(i) For each $n = 0, 1, \dots$, the function $Q_n(f)$ is an algebraic polynomial (in $d$ variables) of degree $n$. Indeed, each of the $U_n(\xi \cdot x)$ is a ridge polynomial of degree $n$ and $Q_n(f)$ is a linear combination of these.

(ii) For each $n = 0, 1, \dots$, the function $A_n(f, \xi)$, $\xi \in S^{d-1}$, is a spherical polynomial of degree $n$. This follows from the fact that each of the $U_n(\xi \cdot x)$, $x \in B^d$, is of this type.

(iii) The constants $\nu_n$, $n = 0, 1, \dots$, are eigenvalues which occur in the Radon inversion formula (see (3.26)).

(iv) Among other reasons, the polynomials $U_n$ occur in this formula because for each $\xi \in S^{d-1}$ the weight $w_{d/2}(t) = (1 - t^2)^{(d-1)/2}$ is a constant multiple of the $(d-1)$-dimensional volume of the intersection of $B^d$ with the hyperplane $x \cdot \xi = t$.

(v) The orthogonality of the functions $Q_n(f)$ occurs because for each $\xi \in S^{d-1}$, the polynomial $U_n(x \cdot \xi)$ is orthogonal to all algebraic polynomials of degree $< n$ on $B^d$ (see (3.4)).

(vi) One can imagine that the integral representation of $Q_n(f)$ can be rewritten as a discrete sum by using some sort of quadrature formula on $S^{d-1}$, thereby obtaining a discrete decomposition of $f$ in terms of ridge polynomials. In the case $d = 2$, one can simply take the canonical quadrature formula for integrating spherical polynomials (i.e. trigonometric polynomials) which uses equally spaced points on the unit circle. This then gives the orthonormal system $\{U_n(\omega \cdot x)\}$, $\omega \in \Omega_n$, $n = 0, 1, \dots$, where $\Omega_n := \{(\cos k\pi/n, \sin k\pi/n)\}_{k=1}^{n}$. This was used in [DOP] as the vehicle for proving approximation results for ridge functions in two variables. In the case $d \ge 3$, we know no analogous quadrature formula. This necessitates a substantial effort (executed in the following section) to derive (less elegant) quadrature formulas which can be used to discretize the integral representation of $Q_n(f)$.

(vii)  The  decomposition  of  Theorem  3.1  is  in  essence  known  (see,  e.g.,  [RK]). 
However,  we  could  find  no  reference  which  gives  it  in  the  above  form. 

There are several ways in which the decomposition of Theorem 3.1 can be derived. One approach is to derive it from the theory of spherical harmonics. A second approach is through Radon transforms; in particular, (3.5) is a rewriting of the Radon inversion formula (see [RK]). We shall briefly explain this at the end of this section.

We shall give a simple and direct proof of this decomposition using fundamental identities for the ridge polynomials $U_n(\xi \cdot x)$, $\xi \in S^{d-1}$. To keep our exposition more fluid, we shall state these identities without proof and relegate the proofs to the appendix.

We start with the following two fundamental identities (proved in the appendix): for each $\xi, \eta \in S^{d-1}$, we have

$$\int_{B^d} U_n(\xi \cdot x)\, U_n(\eta \cdot x)\, dx = \frac{U_n(\xi \cdot \eta)}{U_n(1)}, \tag{3.10}$$

and, for each $\eta \in S^{d-1}$, we have

$$\int_{S^{d-1}} U_n(\xi \cdot x)\, U_n(\xi \cdot \eta)\, d\xi = \frac{U_n(1)}{\nu_n}\, U_n(\eta \cdot x). \tag{3.11}$$

Proof of Theorem 3.1. Let $f \in L_2(B^d)$, $d > 2$. From Remark (i) and (3.4), it follows that $Q_n(f)$ is in $\mathcal{P}_n \ominus \mathcal{P}_{n-1}$. From identities (3.10) and (3.11), we have $Q_n(g) = g$ whenever $g(x) = U_n(\eta \cdot x)$, $\eta \in S^{d-1}$. Therefore $Q_n^2 = Q_n$ and hence $Q_n$ is a projector onto a subspace $Y_n$ of $\mathcal{P}_n \ominus \mathcal{P}_{n-1}$. Thus, to prove (3.5), it remains only to show that

$$\dim(Y_n) = \dim(\mathcal{P}_n \ominus \mathcal{P}_{n-1}) = \dim(\mathcal{P}_n^h), \tag{3.12}$$

where $\mathcal{P}_n^h$ denotes the space of all homogeneous polynomials of degree $n$.

To prove (3.12), we recall a few well-known facts about spherical harmonics which can be found in Stein and Weiss [SW], Chapter 4, see also [Se]. Let $\mathcal{H}_n$ denote the space of spherical harmonics of degree $n$, i.e. $\mathcal{H}_n$ is the set of those functions on $S^{d-1}$ which are the restriction to $S^{d-1}$ of a function from $\mathcal{P}_n^h$ which is harmonic in $B^d$. The spherical harmonics of degree $n$ are orthogonal to those of degree $m \ne n$ with respect to the inner product (3.2). We have

$$\dim(\mathcal{H}_n) = N(d, n) := \frac{(2n + d - 2)\,(n + d - 3)!}{(d - 2)!\; n!} \tag{3.13}$$

and

$$\dim(\mathcal{P}_n^h) = \dim(\mathcal{H}_n \oplus \mathcal{H}_{n-2} \oplus \dots \oplus \mathcal{H}_\varepsilon), \tag{3.14}$$

where $\varepsilon = 0$ if $n$ is even and $\varepsilon := 1$ if $n$ is odd. Write

$$K_n(t) := \frac{N(d, n)}{|S^{d-1}|\; C_n^{(d-2)/2}(1)}\; C_n^{(d-2)/2}(t),$$

where $|S^{d-1}| := \int_{S^{d-1}} 1\, d\xi = \frac{2\pi^{d/2}}{\Gamma(d/2)}$ is the surface area of $S^{d-1}$. The function $K_n(\xi \cdot \eta)$ is the reproducing kernel for $\mathcal{H}_n$, i.e.

$$\int_{S^{d-1}} S(\xi)\, K_n(\xi \cdot \eta)\, d\xi = S(\eta), \qquad S \in \mathcal{H}_n. \tag{3.15}$$
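The dimension counts behind (3.12)-(3.14) can be checked with integer arithmetic. The sketch below uses the closed form $N(d,n) = (2n+d-2)(n+d-3)!/((d-2)!\,n!)$ for the dimension of the spherical harmonics (a standard formula, valid for $d > 2$) and the standard count $\binom{n+d-1}{d-1}$ for homogeneous polynomials; the function names are illustrative.

```python
from math import comb, factorial

def N(d, n):
    """Dimension of the spherical harmonics of degree n on S^(d-1), for d > 2."""
    return (2 * n + d - 2) * factorial(n + d - 3) // (factorial(d - 2) * factorial(n))

def dim_homogeneous(d, n):
    """Dimension of the homogeneous polynomials of degree n in d variables."""
    return comb(n + d - 1, d - 1)

for d in (3, 4, 5):
    for n in range(6):
        # Sum over degrees n, n-2, ..., down to 0 or 1, as in (3.14)
        total = sum(N(d, j) for j in range(n, -1, -2))
        # dim(P_n) - dim(P_{n-1}), the dimension of the orthogonal complement in (3.12)
        gap = comb(n + d, d) - comb(n + d - 1, d)
        print(d, n, total, dim_homogeneous(d, n), gap)
```

In each row the last three numbers coincide, which is the chain of equalities that the proof establishes through the operators $Q_n$ and $A_n$.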


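Moreover, a simple identity for Gegenbauer polynomials (see the appendix (A3)) gives that

$$K_n + K_{n-2} + \dots + K_\varepsilon = \frac{\nu_n}{U_n(1)}\, U_n. \tag{3.16}$$

Hence, the right side of (3.16) is the reproducing kernel for $\mathcal{H}_n \oplus \mathcal{H}_{n-2} \oplus \dots \oplus \mathcal{H}_\varepsilon$, i.e.

$$\frac{\nu_n}{U_n(1)} \int_{S^{d-1}} S(\xi)\, U_n(\xi \cdot \eta)\, d\xi = S(\eta), \qquad S \in \mathcal{H}_n \oplus \mathcal{H}_{n-2} \oplus \dots \oplus \mathcal{H}_\varepsilon. \tag{3.17}$$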
Note that $A_n(f, \xi)$ is a spherical polynomial of degree $n$ since $U_n(\xi \cdot y)$ is a spherical polynomial of degree $n$ in $\xi$. We have $U_n(-t) = (-1)^n U_n(t)$ (see (2.2)) and hence $A_n(f, -\xi) = (-1)^n A_n(f, \xi)$. Therefore, $A_n(f) \in \mathcal{H}_n \oplus \mathcal{H}_{n-2} \oplus \dots \oplus \mathcal{H}_\varepsilon$. Thus, $Q_n$ can be considered as a linear operator mapping $\mathcal{H}_n \oplus \mathcal{H}_{n-2} \oplus \dots \oplus \mathcal{H}_\varepsilon$ into $Y_n$. On the other hand, after multiplying both sides of (3.6) by $U_n(\eta \cdot x)$ and integrating over $B^d$ we obtain

$$\begin{aligned}
\int_{B^d} Q_n(f, x)\, U_n(\eta \cdot x)\, dx
&= \int_{S^{d-1}} A_n(f, \xi) \left(\nu_n \int_{B^d} U_n(\eta \cdot x)\, U_n(\xi \cdot x)\, dx\right) d\xi \\
&= \int_{S^{d-1}} A_n(f, \xi)\, \frac{\nu_n}{U_n(1)}\, U_n(\eta \cdot \xi)\, d\xi = A_n(f, \eta),
\end{aligned}$$

where we used (3.10) and (3.17). Hence, $A_n$ is an operator mapping $Y_n$ onto $\mathcal{H}_n \oplus \mathcal{H}_{n-2} \oplus \dots \oplus \mathcal{H}_\varepsilon$ and it is the inverse operator of $Q_n$. Therefore, $\dim(Y_n) = \dim(\mathcal{H}_n \oplus \mathcal{H}_{n-2} \oplus \dots \oplus \mathcal{H}_\varepsilon)$, which together with (3.14) implies (3.12).

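Since $Q_n(f)$ is in $\mathcal{P}_n \ominus \mathcal{P}_{n-1}$, it is orthogonal to $Q_j(f)$, $j \ne n$, and therefore we have the first equality in (3.9). For the proof of the second equality in (3.9), we use (3.10) to write

$$\begin{aligned}
\int_{B^d} Q_n(f, x)^2\, dx
&= \nu_n^2 \int_{S^{d-1}} \int_{S^{d-1}} A_n(f, \xi)\, A_n(f, \eta) \int_{B^d} U_n(\xi \cdot x)\, U_n(\eta \cdot x)\, dx\, d\xi\, d\eta \\
&= \nu_n^2 \int_{S^{d-1}} \int_{S^{d-1}} A_n(f, \xi)\, A_n(f, \eta)\, \frac{U_n(\xi \cdot \eta)}{U_n(1)}\, d\xi\, d\eta.
\end{aligned}$$

Since $A_n(f) \in \mathcal{H}_n \oplus \mathcal{H}_{n-2} \oplus \dots \oplus \mathcal{H}_\varepsilon$, we can use (3.17) to compute the integral with respect to $\eta$ above. We get

$$\int_{B^d} Q_n(f, x)^2\, dx = \nu_n \int_{S^{d-1}} A_n(f, \xi)^2\, d\xi.$$

This completes the proof of (3.9) and the theorem. □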

In the same way that we have proved (3.9) of Theorem 3.1 we obtain the following formula for calculating inner products:

$$(f, g) = \sum_{n=0}^{\infty} \nu_n \int_{S^{d-1}} A_n(f, \xi)\, A_n(g, \xi)\, d\xi. \tag{3.18}$$

We next consider the decomposition (3.5) for ridge functions. Let $r$ be a univariate function in $L_2(I, w)$, $w := w_{d/2}$. Then

$$r(t) = \sum_{n=0}^{\infty} \hat r(n)\, U_n(t), \qquad \hat r(n) := \int_I r(t)\, U_n(t)\, w(t)\, dt. \tag{3.19}$$

It follows that for any $\rho \in S^{d-1}$, the ridge function $R(x) := r(\rho \cdot x)$ has the representation

$$R(x) = \sum_{n=0}^{\infty} \hat r(n)\, U_n(\rho \cdot x). \tag{3.20}$$
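The univariate expansion in the orthonormal Gegenbauer system can be tried out numerically. The sketch below takes $d = 3$, so that $w(t) = 1 - t^2$, uses a hypothetical cubic profile $r$, computes the coefficients $\hat r(n)$ by a composite Simpson rule, and compares $\sum_n \hat r(n)^2$ with $\int_I r^2 w\,dt$ (Parseval's identity in $L_2(I, w)$). The recurrence used to evaluate $C_n^\lambda$ and all helper names are assumptions made for illustration.

```python
import math

def gegenbauer(n, lam, t):
    """Evaluate C_n^lam(t) by the standard three-term recurrence."""
    c_prev, c = 1.0, 2.0 * lam * t
    if n == 0:
        return c_prev
    for k in range(2, n + 1):
        c_prev, c = c, (2.0 * (k + lam - 1.0) * t * c - (k + 2.0 * lam - 2.0) * c_prev) / k
    return c

def h(n, lam):
    """Squared norm h_{n,lam} of C_n^lam in L2(I, w_lam), as in (2.1)."""
    poch = 1.0
    for j in range(n):
        poch *= 2 * lam + j
    return (math.sqrt(math.pi) * poch * math.gamma(lam + 0.5)
            / ((n + lam) * math.factorial(n) * math.gamma(lam)))

def simpson(f, a, b, panels=2000):
    """Composite Simpson rule with an even number of panels."""
    step = (b - a) / panels
    s = f(a) + f(b)
    for j in range(1, panels):
        s += (4 if j % 2 else 2) * f(a + j * step)
    return s * step / 3

d = 3                                   # illustrative dimension
lam = d / 2
w = lambda t: (1 - t * t) ** ((d - 1) / 2)
U = lambda n, t: gegenbauer(n, lam, t) / math.sqrt(h(n, lam))
r = lambda t: 1 + t - 2 * t ** 3        # hypothetical univariate profile

# r has degree 3, so only the coefficients with n <= 3 are nonzero.
coeffs = [simpson(lambda t: r(t) * U(n, t) * w(t), -1, 1) for n in range(4)]
print(sum(c * c for c in coeffs), simpson(lambda t: r(t) ** 2 * w(t), -1, 1))
print(sum(c * U(n, 0.3) for n, c in enumerate(coeffs)), r(0.3))
```

Replacing $t$ by $\rho \cdot x$ in the reconstructed sum gives the ridge expansion of $R(x) = r(\rho \cdot x)$ with the same coefficients.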
Using (3.10) and (3.4), we see that

$$\|R\|_{L_2(B^d)}^2 = \sum_{n=0}^{\infty} |\hat r(n)|^2. \tag{3.21}$$

Moreover, if $R_1$ and $R_2$ are two such ridge functions corresponding to $r_1$, $\rho_1$ and $r_2$, $\rho_2$, respectively, then from (3.4), (3.11) and (3.18), we have

$$(R_1, R_2) = \sum_{n=0}^{\infty} \hat r_1(n)\, \hat r_2(n)\, \frac{U_n(\rho_1 \cdot \rho_2)}{U_n(1)}. \tag{3.22}$$

There is another approach to deducing the decomposition of Theorem 3.1 which we want to mention since it brings out the connections between this paper and Radon transforms. For each $f \in L_1(B^d)$ the Radon transform is defined by

$$\mathcal{R}(f; \xi, t) := \int_{\xi^\perp \cap B^d} f(t\xi + y)\, dy, \tag{3.23}$$

where $\xi \in S^{d-1}$, $t \in [-1, 1]$, and $\xi^\perp := \{y \in \mathbb{R}^d : y \cdot \xi = 0\}$. So, the integration is over the intersection of the hyperplane $y \cdot \xi = t$ and $B^d$.

We  can  recover  /  from  its  Radon  transform  by  using  the  Radon  transform 
inversion  formula.  The  Radon  transform  inversion  formula  uses  the  operator  (see  e.g 

[LD 

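$$Kg(t) := K_t\, g(t) := \begin{cases} \dfrac{(-1)^{(d-1)/2}}{2(2\pi)^{d-1}} \left(\dfrac{d}{dt}\right)^{d-1} g(t) & \text{for } d \text{ odd}, \\[2ex] \dfrac{(-1)^{(d-2)/2}}{2(2\pi)^{d-1}}\; \mathbf{H} \left(\dfrac{d}{dt}\right)^{d-1} g(t) & \text{for } d \text{ even}, \end{cases} \tag{3.24}$$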

where $\mathbf{H}$ is the Hilbert transform (see (2.7)). The following relation is the Radon inversion formula for functions defined on $B^d$: for every sufficiently smooth function $f$ supported on $B^d$

$$f(x) = \int_{S^{d-1}} M(f; \xi, x \cdot \xi)\, d\xi \qquad\text{with}\qquad M(f; \xi, t) := K_t\, \mathcal{R}(f; \xi, t). \tag{3.25}$$

Lemma 2.1 gives that the functions $U_n$ are eigenfunctions for the operator $K(w\,\cdot)$:

$$K(w U_n) = \nu_n U_n. \tag{3.26}$$

We now show the idea of using the Radon inversion formula to derive a representation of $f$ in terms of the ridge polynomials $\{U_n(x \cdot \xi)\}$. Since $\{U_n\}_{n=0}^\infty$ is a complete orthonormal system for $L_2(I, w)$, we can expand $\mathcal{R}(f; \xi, \cdot)/w$ in terms of the $\{U_n\}_{n=0}^\infty$ to obtain

$$\frac{\mathcal{R}(f; \xi, t)}{w(t)} = \sum_{n=0}^{\infty} A_n(\xi)\, U_n(t) \tag{3.27}$$

with

$$A_n(\xi) := \int_I \mathcal{R}(f; \xi, t)\, U_n(t)\, dt = \int_{B^d} f(y)\, U_n(y \cdot \xi)\, dy. \tag{3.28}$$

After multiplying both sides of (3.27) by the weight $w$ and applying the operator $K := K_t$ we get

$$K \mathcal{R}(f; \xi, \cdot) = \sum_{n=0}^{\infty} A_n(\xi)\, K[w U_n] = \sum_{n=0}^{\infty} \nu_n A_n(\xi)\, U_n,$$

where we used (3.26). Finally, we insert the above in (3.25) and find

$$f(x) = \int_{S^{d-1}} K \mathcal{R}(f; \xi, x \cdot \xi)\, d\xi = \sum_{n=0}^{\infty} \nu_n \int_{S^{d-1}} A_n(\xi)\, U_n(x \cdot \xi)\, d\xi,$$

which is the decomposition from Theorem 3.1. We leave the details of verifying this approach to the reader.


4  Discrete  representation  of  functions  and  norms 

In this section we shall deduce from Theorem 3.1 a discrete representation of functions by ridge polynomials. To this end we shall use a cubature formula for integration on $S^{d-1}$, $d > 2$. We need a cubature formula that is exact for all spherical polynomials of degree $n$. In the case $d = 2$ we used in [DOP] a quadrature formula with equally spaced nodes on the unit circle. Unfortunately, we do not know any "equally spaced points" on $S^{d-1}$, $d > 2$. Also, we do not know of any effective cubature formula with nearly equally spaced nodes on $S^{d-1}$. For this reason we shall use a cubature formula determined by using spherical coordinates on $S^{d-1}$. The results of this section are somewhat technical and the reader may wish to skim them at first and proceed to §5.

The spherical coordinates $(\theta,\phi) := (\theta_1,\theta_2,\dots,\theta_{d-2},\phi)$ on $S^{d-1}$ are defined as usual by

$$\xi_1 = \cos\theta_1,\quad \xi_2 = \sin\theta_1\cos\theta_2,\quad \dots,\quad \xi_{d-2} = \sin\theta_1\sin\theta_2\cdots\sin\theta_{d-3}\cos\theta_{d-2},$$

$$\xi_{d-1} = \sin\theta_1\sin\theta_2\cdots\sin\theta_{d-2}\cos\phi,\quad \xi_d = \sin\theta_1\sin\theta_2\cdots\sin\theta_{d-2}\sin\phi,$$

$0\le\theta_j\le\pi$, $j = 1,2,\dots,d-2$; $0\le\phi\le 2\pi$. We shall denote these identities in vector form briefly by $\xi := \xi(\theta,\phi)$. In these coordinates, the surface element $d\xi$ of $S^{d-1}$ becomes

$$d\xi = (\sin\theta_1)^{d-2}(\sin\theta_2)^{d-3}\cdots\sin\theta_{d-2}\,d\theta_1\,d\theta_2\cdots d\theta_{d-2}\,d\phi =: J(\theta)\,d\theta\,d\phi. \tag{4.1}$$

We have the following identity for integration in spherical coordinates:

$$\int_{S^{d-1}} F(\xi)\,d\xi = \int_0^{2\pi}\!\int_0^{\pi}\!\cdots\!\int_0^{\pi} F(\xi(\theta,\phi))\,J(\theta)\,d\theta_1\cdots d\theta_{d-2}\,d\phi, \tag{4.2}$$

where $J(\theta)$ is the Jacobian given by (4.1). We shall use this to define our cubature.

We wish to construct a cubature that is exact for all spherical polynomials of degree $2n$. Every spherical polynomial of degree $2n$ can obviously be represented in spherical coordinates as a linear combination of terms

$$(\cos\phi)^{k_{d-1}}(\sin\phi)^{\ell_{d-1}}\prod_{j=1}^{d-2}(\cos\theta_j)^{k_j}(\sin\theta_j)^{\ell_j}, \tag{4.3}$$

where $k_j,\ell_j\ge 0$ and $\max\{k_j+\ell_j : j = 1,2,\dots,d-1\}\le 2(d-1)n$. Also, the Jacobian $J$ is represented in the same terms (see (4.1)). So, we need quadrature formulae for integration over $[0,2\pi]$ and $[0,\pi]$ that are exact for trigonometric polynomials of degree $2(d-1)n+d-2$.

We shall use the following quadrature formula for integration on $[0,2\pi]$ with respect to $\phi$:

$$Q_{\phi,k}(g) := \sum_{j=0}^{2k}\varepsilon_j\,g(\gamma_j) \approx \int_0^{2\pi} g(\phi)\,d\phi, \tag{4.4}$$

where $\gamma_j := \frac{2\pi j}{2k+1}$ and $\varepsilon_j := \frac{2\pi}{2k+1}$. The quadrature (4.4) is exact for all trigonometric polynomials of degree $k$ (see [Z], Chapter X).
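
The exactness of an equally weighted rule on equally spaced nodes is easy to check numerically. The following is a minimal sketch, not part of the original argument: it assumes NumPy, and the helper names `equispaced_rule` and `max_error` are hypothetical. It uses the standard fact that $N$ equally spaced nodes with weights $2\pi/N$ integrate every trigonometric polynomial of degree less than $N$ exactly, and that exactness fails at degree $N$.

```python
import numpy as np

def equispaced_rule(num_nodes):
    """Equally spaced nodes on [0, 2*pi) with equal weights 2*pi/num_nodes."""
    nodes = 2.0 * np.pi * np.arange(num_nodes) / num_nodes
    weights = np.full(num_nodes, 2.0 * np.pi / num_nodes)
    return nodes, weights

def max_error(num_nodes, max_degree):
    """Largest quadrature error over 1, cos(m t), sin(m t), 1 <= m <= max_degree."""
    nodes, weights = equispaced_rule(num_nodes)
    err = abs(weights.sum() - 2.0 * np.pi)      # exact integral of 1 is 2*pi
    for m in range(1, max_degree + 1):
        # exact integrals of cos(m t) and sin(m t) over a full period are 0
        err = max(err,
                  abs(weights @ np.cos(m * nodes)),
                  abs(weights @ np.sin(m * nodes)))
    return err

k = 4
print(max_error(2 * k + 1, 2 * k))      # tiny: exact up to degree 2k
print(max_error(2 * k + 1, 2 * k + 1))  # not small: cos((2k+1) t) is aliased to 1
```

With $2k+1$ nodes the rule is in particular exact for degree $k$, which is all that is needed for the $\phi$ variable.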

Since $0\le\theta_j\le\pi$, we need a quadrature for integration over $[0,\pi]$ that is exact for all trigonometric polynomials of degree $k$. In addition, the quadrature should have good localization properties, and we need to control (asymptotically) its nodes and coefficients. Since we do not know of any such quadrature, we construct one in the following lemma.

Lemma 4.1 For any $k = 1,2,\dots$ there exists a quadrature

$$Q_{\theta,k}(g) = \sum_{j=0}^{2k}\lambda_j\,g(\beta_j) \approx \int_0^{\pi} g(\theta)\,d\theta \tag{4.5}$$

with the following properties:

(a) $Q_{\theta,k}(g)$ is exact for all trigonometric polynomials of degree $k$;

(b) $0 < \beta_0 < \beta_1 < \dots < \beta_{2k} < \pi$ and

$$\beta_j - \beta_{j-1} \le \pi k^{-1},\quad j = 0,1,\dots,2k+1; \tag{4.6}$$

(c)

$$0 < \lambda_j \le c\,(\beta_{j+1} - \beta_{j-1}),\quad j = 0,1,\dots,2k, \tag{4.7}$$

where $\beta_{-1} := 0$, $\beta_{2k+1} := \pi$, and $c$ is an absolute constant.

The exact values of the nodes $\beta_j$ and the coefficients $\lambda_j$ of the quadrature (4.5) are given in Remark 4.1 below.

Proof. For symmetry reasons we shall prove the lemma with the interval of integration $[0,\pi]$ replaced by $[-\pi/2,\pi/2]$. We shall build a quadrature

$$Q_k(g) = \sum_{j=-k}^{k}\lambda_j\,g(\theta_j) \approx \int_{-\pi/2}^{\pi/2} g(\theta)\,d\theta \tag{4.8}$$

with symmetric nodes $\theta_j$ and coefficients $\lambda_j$ ($\theta_{-j} = -\theta_j$, $\theta_0 := 0$, and $\lambda_{-j} = \lambda_j$). Then $Q_k(g)$ will be automatically exact for odd polynomials. Therefore, it is enough to construct $Q_k(g)$ exact only for all even trigonometric polynomials of degree $k$. To this end it is sufficient to have

$$Q_k(P(\cos\cdot)) := \sum_{j=-k}^{k}\lambda_j P(\cos\theta_j) = \int_{-\pi/2}^{\pi/2} P(\cos\theta)\,d\theta = 2\int_0^{\pi/2} P(\cos\theta)\,d\theta \tag{4.9}$$

for each algebraic polynomial $P$ of degree $k$. We shall apply the substitution $\theta := \theta(\alpha) := \arccos\left(\cos^2\frac{\alpha}{2}\right)$ to the last integral in (4.9). Simple calculations show that

$$\Lambda(\alpha) := \frac{d}{d\alpha}\theta(\alpha) = \frac{\cos\frac{\alpha}{2}}{\sqrt{1+\cos^2\frac{\alpha}{2}}} \tag{4.10}$$

and hence $\theta(\alpha)$ is increasing on $[0,\pi]$ and maps $[0,\pi]$ onto $[0,\pi/2]$. We obtain

$$\int_0^{\pi/2} P(\cos\theta)\,d\theta = \int_0^{\pi} P\left(\cos^2\frac{\alpha}{2}\right)\Lambda(\alpha)\,d\alpha = \frac12\int_{-\pi}^{\pi} P\left(\cos^2\frac{\alpha}{2}\right)\Lambda(\alpha)\,d\alpha, \tag{4.11}$$

where we used that the integrand is even. We now extend $\Lambda(\alpha)$ $2\pi$-periodically by $\Lambda(\alpha) := \left|\cos\frac{\alpha}{2}\right| \big/ \sqrt{1+\cos^2\frac{\alpha}{2}}$.

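We shall use the Dirichlet kernel $D_k(u) := \frac{\sin\left(k+\frac12\right)u}{2\sin\frac{u}{2}}$ to interpolate the trigonometric polynomial of degree $k$: $P\left(\cos^2\frac{\alpha}{2}\right) = P\left(\frac{1+\cos\alpha}{2}\right)$ at the points $\alpha_j := \frac{2\pi j}{2k+1}$, $j = 0,\pm1,\dots,\pm k$. We have (see [Z, Chapter X])

$$P\left(\cos^2\frac{\alpha}{2}\right) = \frac{2}{2k+1}\sum_{j=-k}^{k} P\left(\cos^2\frac{\alpha_j}{2}\right) D_k(\alpha-\alpha_j).$$

This and (4.11) imply

$$\int_0^{\pi/2} P(\cos\theta)\,d\theta = \frac{1}{2k+1}\sum_{j=-k}^{k} P\left(\cos^2\frac{\alpha_j}{2}\right)\int_{-\pi}^{\pi}\Lambda(\alpha)D_k(\alpha-\alpha_j)\,d\alpha = \sum_{j=-k}^{k}\eta_j P\left(\cos^2\frac{\alpha_j}{2}\right) = \eta_0 P\left(\cos^2\frac{\alpha_0}{2}\right) + 2\sum_{j=1}^{k}\eta_j P\left(\cos^2\frac{\alpha_j}{2}\right), \tag{4.12}$$

where

$$\eta_j := \frac{1}{2k+1}\int_{-\pi}^{\pi}\Lambda(\alpha)D_k(\alpha-\alpha_j)\,d\alpha \tag{4.13}$$

and we used that $\eta_{-j} = \eta_j$ since $\Lambda$ is even and $\alpha_{-j} = -\alpha_j$.

We now define the nodes and the coefficients of our quadrature. Set

$$\theta_j := \arccos\left(\cos^2\frac{\alpha_j}{2}\right)\ \text{for } j = 0,1,\dots,k \quad\text{and}\quad \theta_j := -\theta_{-j}\ \text{for } j = -1,-2,\dots,-k.$$

Also, set

$$\lambda_j := 2\eta_j\ \text{for } j = 0,1,\dots,k \quad\text{and}\quad \lambda_j := \lambda_{-j}\ \text{for } j = -1,-2,\dots,-k.$$

We obtain by (4.9), (4.12) and the symmetry that quadrature (4.8) with the above defined nodes and coefficients is exact for all trigonometric polynomials of degree $k$. It remains to prove that the nodes and the coefficients of the quadrature satisfy the required properties.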

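We have $\theta_j - \theta_{j-1} = \theta'(\zeta_j)(\alpha_j-\alpha_{j-1}) = \Lambda(\zeta_j)(\alpha_j-\alpha_{j-1})$ for some $\zeta_j\in(\alpha_{j-1},\alpha_j)$ and hence, by (4.10),

$$\frac{1}{\sqrt2}\left(\cos\frac{\alpha_j}{2}\right)\frac{2\pi}{2k+1} \le \theta_j-\theta_{j-1} \le \left(\cos\frac{\alpha_{j-1}}{2}\right)\frac{2\pi}{2k+1} \le \frac{\pi}{k},\quad j = 1,2,\dots,k. \tag{4.14}$$

Therefore, the proof will be completed if we show that

$$0 < \eta_j \le c\,k^{-1}\cos\frac{\alpha_j}{2},\quad j = 1,2,\dots,k. \tag{4.15}$$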


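By (4.13) it follows that

$$\eta_j = \pi(2k+1)^{-1}S_k(\Lambda)(\alpha_j),\quad\text{where}\quad S_k(\Lambda)(\alpha) := \frac1\pi\int_{-\pi}^{\pi}\Lambda(\beta)D_k(\beta-\alpha)\,d\beta$$

is the $k$th partial Fourier sum of $\Lambda$. In order to simplify our further calculations we shift $\Lambda$ by $\pi$ and obtain

$$\varphi(\alpha) := \Lambda(\alpha+\pi) = \frac{\left|\sin\frac{\alpha}{2}\right|}{\sqrt{1+\sin^2\frac{\alpha}{2}}}.$$

The function $\varphi$ is even and, therefore, its Fourier coefficients associated with $\sin\nu\alpha$ are all equal to zero. Let

$$a_0 := \frac{1}{2\pi}\int_0^{2\pi}\varphi(\alpha)\,d\alpha \quad\text{and}\quad a_\nu := \frac1\pi\int_0^{2\pi}\varphi(\alpha)\cos\nu\alpha\,d\alpha,\quad \nu = 1,2,\dots$$

be the Fourier coefficients of $\varphi$ associated with $\cos\nu\alpha$. Obviously, $a_0 > 0$. Let $\nu = 1,2,\dots$ Then using integration by parts (twice) we get

$$a_\nu = -\frac{1}{\pi\nu}\int_0^{2\pi}\varphi'(\alpha)\sin\nu\alpha\,d\alpha = \frac{1}{\pi\nu^2}\left[\varphi'(2\pi-)-\varphi'(0+)\right] - \frac{1}{\pi\nu^2}\int_0^{2\pi}\varphi''(\alpha)\cos\nu\alpha\,d\alpha = \frac{1}{\pi\nu^2}\int_0^{2\pi}\varphi''(\alpha)(1-\cos\nu\alpha)\,d\alpha.$$

Therefore

$$a_\nu = \frac{2}{\pi\nu^2}\int_0^{2\pi}\varphi''(\alpha)\sin^2\frac{\nu\alpha}{2}\,d\alpha. \tag{4.16}$$

Simple calculations show that

$$\varphi'(\alpha) = \frac12\cos\frac{\alpha}{2}\Big/\left(1+\sin^2\frac{\alpha}{2}\right)^{3/2} \quad\text{and}\quad \varphi''(\alpha) < 0 \ \text{for } 0<\alpha<2\pi.$$

This and (4.16) imply that $a_\nu < 0$ and

$$|a_\nu| \le \frac{2}{\pi\nu^2}\int_0^{2\pi}|\varphi''(\alpha)|\,d\alpha = -\frac{2}{\pi\nu^2}\int_0^{2\pi}\varphi''(\alpha)\,d\alpha = -\frac{2}{\pi\nu^2}\left[\varphi'(2\pi-)-\varphi'(0+)\right] = \frac{2}{\pi\nu^2}.$$

Thus, we have

$$-\frac{2}{\pi\nu^2} \le a_\nu < 0,\quad \nu = 1,2,\dots. \tag{4.17}$$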

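Therefore $\varphi(\alpha) = a_0 + \sum_{\nu=1}^{\infty} a_\nu\cos\nu\alpha$, where $a_0 > 0$ and $a_\nu < 0$, $\nu = 1,2,\dots$, and hence

$$S_k(\varphi)(\alpha) = a_0 + \sum_{\nu=1}^{k} a_\nu\cos\nu\alpha \ge a_0 + \sum_{\nu=1}^{k} a_\nu = S_k(\varphi)(0) = S_k(\varphi)(0) - \varphi(0) = -\sum_{\nu=k+1}^{\infty} a_\nu > 0.$$

Thus $S_k(\varphi)(\alpha) > 0$ for $\alpha\in[-\pi,\pi)$ and hence $S_k(\Lambda)(\alpha) > 0$ for $\alpha\in[-\pi,\pi)$, which implies the lower bound in (4.15).

The inequalities (4.17) imply

$$\|\Lambda - S_k(\Lambda)\|_C \le \sum_{\nu=k+1}^{\infty}|a_\nu| \le c\,k^{-1}.$$

Using this, we obtain

$$|\eta_j| \le ck^{-1}|S_k(\Lambda)(\alpha_j)| \le ck^{-1}\left(|\Lambda(\alpha_j)| + \|\Lambda - S_k(\Lambda)\|_C\right) \le ck^{-1}\left(\cos\frac{\alpha_j}{2} + ck^{-1}\right) \le ck^{-1}\left(\cos\frac{\alpha_j}{2} + c\cos\frac{\alpha_k}{2}\right) \le ck^{-1}\cos\frac{\alpha_j}{2},$$

where we used that $\cos(\alpha_k/2) = \cos\left(\pi k/(2k+1)\right) \ge ck^{-1}$. Thus the upper estimate in (4.15) is proved. Lemma 4.1 is proved. □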
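
The construction in the proof can be tried out numerically. The sketch below is an illustration under stated assumptions rather than the paper's own code: it assumes NumPy, the function name `lemma41_rule` is hypothetical, the Dirichlet kernel is taken in the form $D_k(u) = \frac12 + \sum_{\nu=1}^{k}\cos\nu u$, and the integrals $\eta_j$ are evaluated with a Gauss-Legendre rule instead of in closed form. It builds the symmetric rule on $[-\pi/2,\pi/2]$ and checks exactness on $\cos m\theta$, $m\le k$.

```python
import numpy as np

def lemma41_rule(k, n_gauss=400):
    """Nodes theta_j and coefficients lambda_j, j = -k..k, on [-pi/2, pi/2]."""
    # Gauss-Legendre rule on [-pi, pi]; used only to evaluate the integrals eta_j.
    x, w = np.polynomial.legendre.leggauss(n_gauss)
    alpha, w_alpha = np.pi * x, np.pi * w
    c = np.cos(alpha / 2.0)                 # nonnegative on [-pi, pi]
    lam_fn = c / np.sqrt(1.0 + c ** 2)      # Lambda(alpha)
    j = np.arange(-k, k + 1)
    alpha_j = 2.0 * np.pi * j / (2 * k + 1)
    # Dirichlet kernel D_k(u) = 1/2 + sum_{nu=1}^k cos(nu u) at u = alpha - alpha_j.
    u = alpha[None, :] - alpha_j[:, None]
    dirichlet = 0.5 + sum(np.cos(nu * u) for nu in range(1, k + 1))
    eta = (dirichlet * (lam_fn * w_alpha)).sum(axis=1) / (2 * k + 1)
    theta = np.sign(j) * np.arccos(np.cos(alpha_j / 2.0) ** 2)
    return theta, 2.0 * eta                 # lambda_j = 2 * eta_j

k = 6
theta, lam = lemma41_rule(k)
for m in range(k + 1):
    exact = np.pi if m == 0 else 2.0 * np.sin(m * np.pi / 2.0) / m
    print(m, abs(lam @ np.cos(m * theta) - exact))
```

Odd functions such as $\sin m\theta$ are integrated exactly by symmetry alone, so only the even part needs the interpolation argument.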


Remark 4.1 The exact values of the nodes $\beta_j$ and the coefficients $\lambda_j$ of the quadrature (4.5) from Lemma 4.1 are the following:

$$\beta_j = \frac{\pi}{2} - \arccos\left(\cos^2\frac{(k-j)\pi}{2k+1}\right),\quad j = 0,1,\dots,k,$$

and $\beta_j = \pi - \beta_{2k-j}$, $j = k+1,k+2,\dots,2k$;

$$\lambda_j = \frac{2}{2k+1}\int_{-\pi}^{\pi}\cos\frac{\alpha}{2}\left(1+\cos^2\frac{\alpha}{2}\right)^{-1/2} D_k\left(\alpha - \frac{2\pi(k-j)}{2k+1}\right)d\alpha,\quad j = 0,1,\dots,k,$$

and $\lambda_j = \lambda_{2k-j}$, $j = k+1,k+2,\dots,2k$, where $D_k$ is the Dirichlet kernel of degree $k$.


We are now in a position to construct our cubature formula for integration over $S^{d-1}$ ($d > 2$). We shall use (4.2), (4.3) and the quadratures from (4.4) and (4.5).

Definition of cubature $Q_n$. Given $n = 1,2,\dots$ we select $k := 2(d-1)n+d-2$. Let $J_n$ be the set of all indices $j := (j_1,\dots,j_{d-1})$ such that $0\le j_\nu\le 2k$, i.e. $J_n := \{0,1,\dots,2k\}^{d-1}$. Note that the cardinality of $J_n$ is $\#J_n = (2k+1)^{d-1} \asymp n^{d-1}$. Set $\beta_j := (\beta_{j_1},\dots,\beta_{j_{d-2}})$, $\gamma_j := \gamma_{j_{d-1}}$, $\omega_j := \xi(\beta_j,\gamma_j)$, and $\lambda_j := J(\beta_j)\,\varepsilon_{j_{d-1}}\prod_{\nu=1}^{d-2}\lambda_{j_\nu}$, where $\gamma_j$, $\varepsilon_j$, $\beta_j$, and $\lambda_j$ are the nodes and the coefficients of quadratures (4.4) and (4.5), respectively, and $J$ is from (4.1). We define

$$Q_n(f) := \sum_{j\in J_n}\lambda_j f(\omega_j) \approx \int_{S^{d-1}} f(\xi)\,d\xi. \tag{4.18}$$

When it is possible we shall write this cubature with the following simpler indices. Let $\Omega_n$ be the set of all nodes $\omega = \omega_j$, and $\lambda_\omega := \lambda_{\omega_j} := \lambda_j$, $j\in J_n$. Then cubature (4.18) can be rewritten in the form

$$Q_n(f) := \sum_{\omega\in\Omega_n}\lambda_\omega f(\omega) \approx \int_{S^{d-1}} f(\xi)\,d\xi. \tag{4.19}$$

Observe that $\#\Omega_n \asymp n^{d-1}$.
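
For $d = 3$ the definition can be sketched in a few lines. This is a hedged numerical illustration, not the paper's implementation: it assumes NumPy, the names `theta_rule` and `sphere_cubature` are hypothetical, the $\phi$ rule is taken with $2k+1$ equally spaced nodes, and the $\theta$ rule is the numerical version of the construction in Lemma 4.1 (with its integrals evaluated by Gauss-Legendre quadrature). The weights combine the Jacobian $J(\beta) = \sin\beta$ with the two one-dimensional coefficients.

```python
import numpy as np

def theta_rule(k, n_gauss=400):
    """Rule on [0, pi], exact for trigonometric polynomials of degree k."""
    x, w = np.polynomial.legendre.leggauss(n_gauss)
    alpha, w_alpha = np.pi * x, np.pi * w
    c = np.cos(alpha / 2.0)
    lam_fn = c / np.sqrt(1.0 + c ** 2)
    j = np.arange(-k, k + 1)
    alpha_j = 2.0 * np.pi * j / (2 * k + 1)
    u = alpha[None, :] - alpha_j[:, None]
    dirichlet = 0.5 + sum(np.cos(nu * u) for nu in range(1, k + 1))
    eta = (dirichlet * (lam_fn * w_alpha)).sum(axis=1) / (2 * k + 1)
    theta = np.sign(j) * np.arccos(np.cos(alpha_j / 2.0) ** 2)
    return theta + np.pi / 2.0, 2.0 * eta    # shift [-pi/2, pi/2] to [0, pi]

def sphere_cubature(n):
    """Nodes (N, 3) and weights (N,) of the product cubature on S^2 (d = 3)."""
    d = 3
    k = 2 * (d - 1) * n + d - 2
    beta, lam = theta_rule(k)
    gamma = 2.0 * np.pi * np.arange(2 * k + 1) / (2 * k + 1)
    eps = 2.0 * np.pi / (2 * k + 1)
    B, G = np.meshgrid(beta, gamma, indexing="ij")
    nodes = np.stack([np.cos(B), np.sin(B) * np.cos(G), np.sin(B) * np.sin(G)],
                     axis=-1).reshape(-1, 3)
    weights = (np.sin(B) * lam[:, None] * eps).reshape(-1)   # J(beta) * lambda * eps
    return nodes, weights

nodes, weights = sphere_cubature(2)
print(weights.sum() - 4.0 * np.pi)                        # area of S^2
print(weights @ nodes[:, 0] ** 2 - 4.0 * np.pi / 3.0)     # degree 2
print(weights @ nodes[:, 2] ** 4 - 4.0 * np.pi / 5.0)     # degree 4 = 2n
```

The three printed differences should be at rounding level, since all three integrands are spherical polynomials of degree at most $2n = 4$.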

As we mentioned in the beginning of this section, every spherical polynomial of degree $2n$ can be represented in spherical coordinates as a linear combination of terms like those in (4.3) and the Jacobian $J$ is represented in a similar way (see (4.1)). On the other hand, quadratures (4.4) and (4.5) are exact for trigonometric polynomials of degree $k := 2(d-1)n+d-2$. Therefore (see (4.2)), cubature (4.18) (or (4.19)) is exact for all spherical polynomials of degree $2n$, i.e. for every spherical polynomial $S$ of degree $\le 2n$ we have

$$Q_n(S) := \sum_{\omega\in\Omega_n}\lambda_\omega S(\omega) = \int_{S^{d-1}} S(\xi)\,d\xi. \tag{4.20}$$

Note that $\lambda_\omega > 0$ and, since (4.20) holds for $S = 1$, then

$$\sum_{\omega\in\Omega_n}\lambda_\omega = \int_{S^{d-1}} 1\,d\xi =: |S^{d-1}|. \tag{4.21}$$

Identity (4.20) implies discrete representations of the projection $Q_m(f)$ of any $f\in L_2(B^d)$ onto $\mathcal{P}_m\ominus\mathcal{P}_{m-1}$ and of $\|A_m(f)\|_{L_2(S^{d-1})}$ (see (3.6) and (3.7) from Theorem 3.1). Namely, since $A_m(f,\xi)U_m(x\cdot\xi)$ and $|A_m(f,\xi)|^2$, for $m\le n$, are spherical polynomials of degree $\le 2m\le 2n$, then

$$Q_m(f,x) := \nu_m\int_{S^{d-1}} A_m(f,\xi)U_m(x\cdot\xi)\,d\xi = \nu_m\sum_{\omega\in\Omega_n}\lambda_\omega A_m(f,\omega)U_m(x\cdot\omega) \tag{4.22}$$

and

$$\|A_m(f)\|^2_{L_2(S^{d-1})} := \int_{S^{d-1}}|A_m(f,\xi)|^2\,d\xi = \sum_{\omega\in\Omega_n}\lambda_\omega|A_m(f,\omega)|^2. \tag{4.23}$$

Since quadratures (4.4) and (4.5) have good localization properties, cubature (4.18) (or (4.19)) has such properties as well. We shall use them to prove the following lemma.

Lemma 4.2 Let $n = 1,2,\dots$, $m = 1,2,\dots$, and let, for $\gamma\in(0,\pi]$, $K_m(\cos\gamma) := c_0\min\{m^{d-1},\,m^{d-1}/(m\gamma)^d\}$ with $c_0 > 0$ a constant. Then, we have

$$Q_n(K_m(\cdot\,\cdot\,\eta)) \le c\left[1+(m/n)^{d-1}\right]\quad\text{for } \eta\in S^{d-1}, \tag{4.24}$$

where $Q_n$ is the cubature from (4.19) and $c$ depends only on $d$ and $c_0$.

Proof. In what follows, we shall assume that $n\ge n_0$, where $n_0$ is sufficiently large and depends only on the dimension $d$. Estimate (4.24) obviously holds for $n < n_0$ by (4.21). We first construct a tiling of $S^{d-1}$ which is determined by the nodes of cubature (4.18). We associate with each node $\omega_j$ the spherical box (tile) $T_j$ consisting of all points $\xi\in S^{d-1}$ for which $\xi = \xi(\theta,\phi)$ with

$$(\theta,\phi)\in[a_{j_1},a_{j_1+1})\times\dots\times[a_{j_{d-2}},a_{j_{d-2}+1})\times[b_{j_{d-1}},b_{j_{d-1}+1}),$$

where $a_j := \frac12(\beta_j+\beta_{j-1})$ and $b_j := \frac12(\gamma_j+\gamma_{j-1})$ with $\beta_j$ from (4.5) and $\gamma_j$ from (4.4). Observe that $\omega_j\in T_j$ is the (spherical) center of $T_j$. Obviously $T_j\cap T_i = \emptyset$, $j\ne i$, and the tiles $T_j$ cover $S^{d-1}$ excluding small regions around the poles. The most important property of our cubature is that

$$0 < \lambda_j \le c\int_{T_j} 1\,d\xi =: c\,|T_j|\quad\text{for } j\in J_n. \tag{4.25}$$

This property follows readily from (4.7), the definition of $\gamma_j$ from (4.4), and the definition of our cubature (see (4.18)).

The second important property of our tiling is that the diameter of each tile $T_j$ is $\le cn^{-1}$. We let $\rho(\xi,\eta) := \arccos\xi\cdot\eta$, $\xi,\eta\in S^{d-1}$, denote the angular distance on $S^{d-1}$ (the angle between the vectors $\xi$ and $\eta$). It is easily seen that $\rho(\xi,\eta)$ satisfies the axioms for a distance on $S^{d-1}$. Since $\beta_j-\beta_{j-1}\le cn^{-1}$, by (4.6), and $\gamma_j-\gamma_{j-1}\le cn^{-1}$, by the definition of $\gamma_j$, then

$$\sup\{\rho(\xi,\eta) : \xi,\eta\in T_j\} \le c_1 n^{-1}, \tag{4.26}$$

where $c_1$ depends only on $d$.

Suppose that $\eta\in S^{d-1}$ is fixed. We select a new coordinate system such that $\eta = e_1' := (1,0,\dots,0)$ is its first coordinate vector. This can be done by a suitable rotation of the old coordinate system. For $\xi\in S^{d-1}$, we shall denote by $\theta' := (\theta_1',\dots,\theta_{d-2}')$ and $\phi'$ the new spherical coordinates of $\xi$.

We define, for $\nu = 1,2,\dots,n$,

$$Z_\nu := \left\{\xi\in S^{d-1} : \frac{\pi(\nu-1)}{n} \le \rho(\xi,e_1') \le \frac{\pi\nu}{n}\right\}$$

and

$$Z_\nu^* := \left\{\xi\in S^{d-1} : \max\left\{\frac{\pi(\nu-1)-c_1}{n},\,0\right\} \le \rho(\xi,e_1') \le \min\left\{\frac{\pi\nu+c_1}{n},\,\pi\right\}\right\},$$

where $c_1$ is from (4.26). Obviously $\bigcup_{\nu=1}^{n} Z_\nu = S^{d-1}$.

Let $\mathcal{T}_\nu$ be the set of all tiles $T_j$ with centers $\omega_j\in Z_\nu$. It follows by (4.26) that $\bigcup_{T\in\mathcal{T}_\nu} T\subset Z_\nu^*$ and hence

$$\sum_{T\in\mathcal{T}_\nu}|T| \le |Z_\nu^*| := \int_{Z_\nu^*} 1\,d\xi \le c\int_{Z_\nu} 1\,d\xi =: c\,|Z_\nu|,\quad \nu = 1,2,\dots,n. \tag{4.27}$$

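We are now ready to estimate $Q_n(K_m(\cdot\,\cdot\,\eta))$. If $\nu = 1$, then we obtain, using (4.25), (4.27), and the assumptions of the lemma,

$$\sum_{\omega_j\in Z_1}\lambda_j K_m(\omega_j\cdot e_1') \le c\max\{K_m(\cos\theta_1') : 0\le\theta_1'\le\pi/n\}\sum_{T\in\mathcal{T}_1}|T| \le cm^{d-1}|Z_1| \le cm^{d-1}\int_0^{\pi/n}\sin^{d-2}\theta\,d\theta \le c(m/n)^{d-1}.$$

If $\nu\ge 2$, then

$$\sum_{\omega_j\in Z_\nu}\lambda_j K_m(\omega_j\cdot e_1') \le c\max\{K_m(\cos\theta_1') : \pi(\nu-1)/n\le\theta_1'\le\pi\nu/n\}\sum_{T\in\mathcal{T}_\nu}|T| \le cK_m\left(\cos\frac{\pi(\nu-1)}{n}\right)|Z_\nu^*| \le cK_m\left(\cos\frac{\pi\nu}{n}\right)|Z_\nu| \le c\int_{Z_\nu}K_m(\xi\cdot e_1')\,d\xi,$$

where we used that $K_m\left(\cos\frac{\pi(\nu-1)}{n}\right)\le cK_m\left(\cos\frac{\pi\nu}{n}\right)$, $\nu\ge 2$, which follows from the definition of $K_m(\cos\gamma)$ in the assumptions of the lemma. Therefore

$$\sum_{j\in J_n}\lambda_j K_m(\omega_j\cdot e_1') \le c(m/n)^{d-1} + c\int_Z K_m(\xi\cdot e_1')\,d\xi, \tag{4.28}$$

where $Z := \bigcup_{\nu=2}^{n} Z_\nu$. We obtain, using again the definition of $K_m(\cos\gamma)$,

$$\int_Z K_m(\xi\cdot e_1')\,d\xi = |S^{d-2}|\int_{\pi/n}^{\pi}K_m(\cos\theta)\sin^{d-2}\theta\,d\theta \le c\int_0^{\infty}\min\{m^{d-1},\,m^{d-1}/(m\theta)^d\}\,\theta^{d-2}\,d\theta \le c < \infty.$$

The above estimates and (4.28) imply (4.24). □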
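
The reason the last integral in the proof is bounded independently of $m$ can be made explicit by the change of variables $u = m\theta$. The following short computation is added as a worked check and is not part of the original text; it assumes $d\ge 2$ and omits the constant $c_0$.

```latex
\int_0^{\infty}\min\Big\{m^{d-1},\frac{m^{d-1}}{(m\theta)^{d}}\Big\}\,\theta^{d-2}\,d\theta
  = \int_0^{\infty}\min\{1,u^{-d}\}\,u^{d-2}\,du
  = \int_0^{1}u^{d-2}\,du+\int_1^{\infty}u^{-2}\,du
  = \frac{1}{d-1}+1=\frac{d}{d-1}.
```

Here $\theta^{d-2}\,d\theta = m^{-(d-1)}u^{d-2}\,du$, which cancels the factor $m^{d-1}$ exactly.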

We shall deal with discrete sums of spherical polynomial values. For this, we need a rapidly decaying reproducing kernel for the space of spherical polynomials of degree $m$. The following well-known proposition gives us such a kernel.

Proposition 4.1 There exists a constant $m_0 = m_0(d)$ such that for every $m\ge m_0$ there exists an algebraic polynomial $W_m$ of degree $dm$ with the properties:

(a)

$$S(\eta) = \int_{S^{d-1}} W_m(\eta\cdot\xi)\,S(\xi)\,d\xi,\quad \eta\in S^{d-1},$$

for each spherical polynomial $S$ of degree $\le m$,

(b)

$$|W_m(\cos\tau)| \le c_0\min\{m^{d-1},\,m^{d-1}/(m\tau)^d\}\quad\text{for } 0\le\tau\le\pi, \tag{4.29}$$

and hence

$$\int_{S^{d-1}}|W_m(\eta\cdot\xi)|\,d\xi \le c < \infty,\quad \eta\in S^{d-1}, \tag{4.30}$$

where $c_0$ and $c$ are independent of $m$ and $\eta$.

Since we do not have a good reference for Proposition 4.1, we shall show how it can be deduced from the following results of E. Kogbetliantz and E. Stein (see also [P]):

Proposition 4.2 [K] Let $S_m(t) := \sum_{\nu=0}^{m}(\nu+\lambda)C_\nu^\lambda(t)$, $\lambda > 0$, $m = 0,1,\dots$, and let $\sigma_m^{(\delta)}$ be the Cesàro means of order $\delta$ of $S_m$, i.e.

$$\sigma_m^{(\delta)}(t) := \frac{1}{A_m^\delta}\sum_{\nu=0}^{m}A_{m-\nu}^\delta(\nu+\lambda)C_\nu^\lambda(t)\quad\text{with}\quad A_m^\delta := \frac{\Gamma(m+\delta+1)}{\Gamma(\delta+1)\Gamma(m+1)}. \tag{4.31}$$

Then, for $-1 < \delta\le 2\lambda+1$,

$$|\sigma_m^{(\delta)}(\cos\gamma)| \le c\min\left\{(m+1)^{2\lambda+1},\,(m+1)^{2\lambda-\delta}\left(\sin\frac{\gamma}{2}\right)^{-(\delta+1)}\right\},\quad 0\le\gamma\le\pi, \tag{4.32}$$

with $c$ depending only on $\lambda$.

Proposition 4.3 [St] For each positive integer $r$ and for $m = 0,1,\dots$, there exist $r+1$ parameters $a_1(m),\dots,a_{r+1}(m)$ (depending only on $m$ and $r$) which are uniformly bounded: $|a_\nu(m)|\le A$, $A$ independent of $m$, and there exists a fixed integer $N$, so that the following holds:

If $\sum_{\nu=0}^{\infty}a_\nu$ is a series of real numbers and if $\sigma_m^{(r)}$, $m = 0,1,\dots$, are the Cesàro means of order $r$ of the partial sums $S_m$, $m = 0,1,\dots$, of this series (see (4.31)), then

$$\tau_m := a_1(m)\sigma_{m-1}^{(r)} + a_2(m)\sigma_{2m-1}^{(r)} + \dots + a_{r+1}(m)\sigma_{(r+1)m-1}^{(r)}$$

can be represented in the form

$$\tau_m = \sum_{\nu=0}^{m}a_\nu + \sum_{\nu=m+1}^{(r+1)m}\beta_\nu a_\nu,$$

where $\beta_\nu$ are constants depending on $m$ and $r$.
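
The simplest instance of such a combination of Cesàro means is the classical de la Vallée Poussin (delayed) mean, which corresponds to order $r = 1$ with coefficients $-1$ and $2$: $2\sigma_{2m-1}-\sigma_{m-1}$. The sketch below is an illustration of that classical case only, not of Stein's general parameters; it assumes NumPy, and the names `cesaro_means_order1` and `delayed_mean` are hypothetical. It checks that the first $m+1$ terms of the series are reproduced with coefficient $1$ and the later ones are damped linearly.

```python
import numpy as np

def cesaro_means_order1(terms):
    """sigma_n = (S_0 + ... + S_n) / (n + 1), where S_n are the partial sums."""
    partial_sums = np.cumsum(terms)
    return np.cumsum(partial_sums) / np.arange(1, len(terms) + 1)

def delayed_mean(terms, m):
    """tau_m = 2 * sigma_{2m-1} - sigma_{m-1}; needs at least 2m terms."""
    sigma = cesaro_means_order1(terms)
    return 2.0 * sigma[2 * m - 1] - sigma[m - 1]

rng = np.random.default_rng(0)
m = 5
terms = rng.standard_normal(2 * m)

# Expected representation: coefficient 1 for nu <= m, (2m - nu)/m for m < nu <= 2m - 1.
coeff = np.ones(2 * m)
coeff[m + 1:] = (2 * m - np.arange(m + 1, 2 * m)) / m
print(delayed_mean(terms, m) - coeff @ terms)
```

The printed difference should be at rounding level.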


Proof of Proposition 4.1. We have already mentioned in (3.15) that

$$K_m(t) := \frac{N(d,m)}{|S^{d-1}|}\,\frac{C_m^{(d-2)/2}(t)}{C_m^{(d-2)/2}(1)}$$

gives the reproducing kernel $K_m(\xi\cdot\eta)$ for $\mathcal{H}_m$. Therefore, $\sum_{\nu=0}^{m}K_\nu(\xi\cdot\eta)$ is a reproducing kernel for all spherical polynomials of degree $\le m$. Simple calculations show that

$$K_m(t) = 2\left[|S^{d-1}|(d-2)\right]^{-1}(m+\lambda)\,C_m^\lambda(t)\quad\text{with } \lambda := (d-2)/2.$$

Therefore, $2\left[|S^{d-1}|(d-2)\right]^{-1}\sum_{\nu=0}^{m}(\nu+\lambda)C_\nu^\lambda(t)$ gives a reproducing kernel for the spherical polynomials of degree $\le m$.

We now apply Proposition 4.2 with $\lambda := (d-2)/2$ and $\delta := 2\lambda+1 = d-1$. Then we apply Proposition 4.3 to the resulting Cesàro means $\{\sigma_m^{(\delta)}\}$ with $r := \delta = d-1$ to conclude that $W_m := 2\left[|S^{d-1}|(d-2)\right]^{-1}\tau_m$ satisfies (4.29) (by (4.32) and since $a_\nu(m)$ are uniformly bounded) and $W_m(\xi\cdot\eta)$ is a reproducing kernel for the spherical polynomials of degree $\le m$ (by Proposition 4.3). □

Lemma 4.2 and Proposition 4.1 allow us to estimate discrete $\ell_p(\Omega_n)$ norms of spherical polynomials by their $L_p(S^{d-1})$ norms. In this part we use ideas from [0].


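Lemma 4.3 Let $n = 1,2,\dots$, and let $m\ge m_0$, where $m_0$ is from Proposition 4.1. Then for every spherical polynomial $S$ of degree $m$ and for $1\le p < \infty$ we have

$$\sum_{\omega\in\Omega_n}\lambda_\omega|S(\omega)|^p \le c\left[1+(m/n)^{d-1}\right]\int_{S^{d-1}}|S(\xi)|^p\,d\xi, \tag{4.33}$$

where $\lambda_\omega$ and $\Omega_n$ are from (4.19), and $c$ is independent of $S$, $n$ and $m$.

Proof. By Proposition 4.1 we get $S(\omega) = \int_{S^{d-1}}W_m(\omega\cdot\xi)S(\xi)\,d\xi$, $\omega\in\Omega_n$. We obtain, using Hölder's inequality,

$$|S(\omega)| \le \int_{S^{d-1}}|W_m(\omega\cdot\xi)S(\xi)|\,d\xi = \int_{S^{d-1}}|W_m(\omega\cdot\xi)|^{1-1/p}\,|W_m(\omega\cdot\xi)|^{1/p}|S(\xi)|\,d\xi \le \left(\int_{S^{d-1}}|W_m(\omega\cdot\xi)|\,d\xi\right)^{1-1/p}\left(\int_{S^{d-1}}|W_m(\omega\cdot\xi)|\,|S(\xi)|^p\,d\xi\right)^{1/p}$$

and hence

$$|S(\omega)|^p \le A^{p-1}\int_{S^{d-1}}|W_m(\omega\cdot\xi)|\,|S(\xi)|^p\,d\xi,\quad\text{where } A := \int_{S^{d-1}}|W_m(\omega\cdot\xi)|\,d\xi.$$

We now multiply both sides of the above inequality by $\lambda_\omega$ and sum over $\omega\in\Omega_n$ to obtain

$$\sum_{\omega\in\Omega_n}\lambda_\omega|S(\omega)|^p \le A^{p-1}\int_{S^{d-1}}\sum_{\omega\in\Omega_n}\lambda_\omega|W_m(\omega\cdot\xi)|\,|S(\xi)|^p\,d\xi \le A^{p-1}\max_{\xi\in S^{d-1}}Q_n(|W_m(\cdot\,\cdot\,\xi)|)\int_{S^{d-1}}|S(\xi)|^p\,d\xi.$$

It follows, by (4.29), that $|W_m(\cdot\,\cdot\,\xi)|\le K_m(\cdot\,\cdot\,\xi)$, where $K_m$ is defined in Lemma 4.2 with $c_0$ from Proposition 4.1. Then Proposition 4.1 and Lemma 4.2 imply

$$\max_{\xi\in S^{d-1}}Q_n(|W_m(\cdot\,\cdot\,\xi)|) \le \max_{\xi\in S^{d-1}}Q_n(K_m(\cdot\,\cdot\,\xi)) \le c\left[1+(m/n)^{d-1}\right]\quad\text{and}\quad A\le c,$$

which completes the proof of Lemma 4.3. □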

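The following lemma relates the $L_2(S^{d-1})$ norms and discrete $\ell_2(\Omega_n)$ norms of spherical polynomials written in terms of $\nu_m U_m(\xi\cdot\omega)/U_m(1)$, the reproducing kernel for the space $\mathcal{H}_m\oplus\mathcal{H}_{m-2}\oplus\dots\oplus\mathcal{H}_\varepsilon$ (see (3.17)).

Lemma 4.4 Let $n = 1,2,\dots$ and let $c(\omega)$, $\omega\in\Omega_n$, be real constants. Let $m\ge m_0$, where $m_0$ is from Proposition 4.1. Then, the spherical polynomial

$$S(\xi) := \sum_{\omega\in\Omega_n}\lambda_\omega c(\omega)\frac{\nu_m}{U_m(1)}U_m(\xi\cdot\omega)$$

satisfies

$$\|S\|^2_{L_2(S^{d-1})} \le c\left[1+(m/n)^{d-1}\right]\sum_{\omega\in\Omega_n}\lambda_\omega|c(\omega)|^2. \tag{4.34}$$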

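Proof. Using (3.11) we get

$$\|S\|^2_{L_2(S^{d-1})} = \int_{S^{d-1}}|S(\xi)|^2\,d\xi = \sum_{\omega\in\Omega_n}\sum_{\eta\in\Omega_n}\lambda_\omega\lambda_\eta c(\omega)c(\eta)\frac{\nu_m^2}{U_m(1)^2}\int_{S^{d-1}}U_m(\xi\cdot\omega)U_m(\xi\cdot\eta)\,d\xi$$

$$= \sum_{\omega\in\Omega_n}\sum_{\eta\in\Omega_n}\lambda_\omega\lambda_\eta c(\omega)c(\eta)\frac{\nu_m}{U_m(1)}U_m(\omega\cdot\eta) = \sum_{\eta\in\Omega_n}\lambda_\eta c(\eta)S(\eta) \le \left(\sum_{\eta\in\Omega_n}\lambda_\eta|c(\eta)|^2\right)^{1/2}\left(\sum_{\eta\in\Omega_n}\lambda_\eta|S(\eta)|^2\right)^{1/2}.$$

By Lemma 4.3, the last factor above does not exceed $c\left[1+(m/n)^{d-1}\right]^{1/2}\|S\|_{L_2(S^{d-1})}$. Finally, dividing by $\|S\|_{L_2(S^{d-1})}$ and squaring completes the proof of the lemma. □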


5 Smoothness spaces in $L_2(B^d)$

In this section, we shall recall results about approximation by algebraic polynomials. As earlier, we let $\mathcal{P}_n$ denote the space of algebraic polynomials in $d$ variables. For $n\ge 1$, let

$$E_n(f) := E_n(f)_{L_2(B^d)} := \inf_{P\in\mathcal{P}_n}\|f-P\|_{L_2(B^d)}$$

be the error in approximating $f\in L_2(B^d)$ by algebraic polynomials $P$ of degree $\le n$. By Theorem 3.1 we have the following representation of the polynomial $P_n(f,x)$ of best $L_2(B^d)$-approximation to $f$:

$$P_n(f,x) = \sum_{m=0}^{n}\nu_m\int_{S^{d-1}}A_m(\xi)U_m(x\cdot\xi)\,d\xi, \tag{5.1}$$

where

$$A_m(\xi) := A_m(f,\xi) := \int_{B^d}f(y)U_m(y\cdot\xi)\,dy.$$

Since $A_m(\xi)U_m(x\cdot\xi)$ is a spherical polynomial of degree $\le 2m\le 2n$ in $\xi$, we can use the quadrature formula (4.19) to obtain

$$P_n(f,x) = \sum_{\omega\in\Omega_n}\lambda_\omega\sum_{m=0}^{n}\nu_m A_m(\omega)U_m(x\cdot\omega). \tag{5.2}$$

From Theorem 3.1, we have

$$E_n(f)^2 = \|f-P_n(f)\|^2_{L_2(B^d)} = \sum_{m>n}\nu_m\|A_m(f)\|^2_{L_2(S^{d-1})} \asymp \sum_{m>n}m^{d-1}\|A_m(f)\|^2_{L_2(S^{d-1})}. \tag{5.3}$$

For $\alpha > 0$, let $W^\alpha(L_2(B^d))$ be the Sobolev space for the domain $B^d$. When $\alpha = k$ is an integer, then a function $f\in L_2(B^d)$ is in $W^k(L_2(B^d))$ if and only if its distributional derivatives $D^\nu f$ of order $k$ are in $L_2(B^d)$, and

$$|f|^2_{W^k(L_2(B^d))} := \sum_{|\nu|=k}\|D^\nu f\|^2_{L_2(B^d)}$$

gives the semi-norm for $W^k(L_2(B^d))$. The norm for $W^k(L_2(B^d))$ is obtained by adding $\|f\|_{L_2(B^d)}$ to $|f|_{W^k(L_2(B^d))}$. For other values of $\alpha$, we obtain $W^\alpha$ as the interpolation space

$$W^\alpha(L_2(B^d)) = \left(L_2(B^d),\,W^k(L_2(B^d))\right)_{\theta,2},\quad \theta = \alpha/k,\quad 0 < \alpha < k,$$

given by the real method of interpolation (see, e.g., Bennett and Sharpley [BS]).

A fundamental result in approximation known as the Jackson theorem states that

$$E_n(f) \le c(k)\,n^{-k}\|f\|_{W^k(L_2(B^d))}, \tag{5.4}$$

where the norm on the right can be replaced by the semi-norm if $k$ is an integer. This theorem can be deduced easily from the results on univariate approximation in Chapter 7 of [DL]. By interpolation (see, e.g., [DL, Chapter 7]), one obtains

$$\sum_{n=1}^{\infty}n^{2\alpha-1}E_n(f)^2 \le c(\alpha)\|f\|^2_{W^\alpha(L_2(B^d))},\quad \alpha > 0, \tag{5.5}$$

with $c(\alpha)$ depending at most on $\alpha$. From (5.3) and (5.5), it is easy to deduce that

$$\sum_{n=1}^{\infty}n^{2\alpha+d-1}\|A_n(f)\|^2_{L_2(S^{d-1})} \le c(\alpha)\|f\|^2_{W^\alpha(L_2(B^d))},\quad \alpha > 0, \tag{5.6}$$

with $c(\alpha)$ depending at most on $\alpha$.


6 Approximation of functions in $L_2(I,w)$

We shall also need certain results about the approximation of univariate functions in $L_2(I,w)$ where $I := [-1,1]$ and $w := w_{d/2}$. As we know by §2, the Gegenbauer polynomials $\{U_m\}_{m=0}^{\infty}$ form a complete orthonormal system for $L_2(I,w)$ (see (3.3)). For any $g\in L_2(I,w)$ we have

$$g = \sum_{m=0}^{\infty}\hat g(m)U_m\quad\text{with}\quad \hat g(m) := \int_I g(s)U_m(s)w(s)\,ds. \tag{6.1}$$

We shall use approximation of functions in $L_2(I,w)$ as an intermediate tool in establishing our results on ridge approximation. Let $\mathcal{P}_n(I)$ denote the space of univariate algebraic polynomials of degree $\le n$. For a function $g\in L_2(I,w)$, we let

$$E_n(g)_{L_2(I,w)} := \inf_{p\in\mathcal{P}_n(I)}\|g-p\|_{L_2(I,w)}$$

be the error in approximating $g$ by the elements of $\mathcal{P}_n(I)$. The polynomial

$$p_n := \sum_{m=0}^{n}\hat g(m)U_m \tag{6.2}$$

is the best $L_2(I,w)$ approximation to $g$ by elements of $\mathcal{P}_n(I)$, and we have

$$E_n(g)^2_{L_2(I,w)} = \|g-p_n\|^2_{L_2(I,w)} = \sum_{m>n}|\hat g(m)|^2. \tag{6.3}$$

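We introduce the univariate Sobolev spaces $W^\alpha(L_2(I,w))$, $\alpha\in\mathbb{R}$, whose norms are defined by

$$\|g\|^2_{W^\alpha(L_2(I,w))} := \sum_{m=0}^{\infty}\left[(m+1)^\alpha|\hat g(m)|\right]^2. \tag{6.4}$$

It follows that for each $g\in W^\alpha(L_2(I,w))$,

$$E_n(g)_{L_2(I,w)} \le c(\alpha)\,n^{-\alpha}\|g\|_{W^\alpha(L_2(I,w))}. \tag{6.5}$$

Moreover, similar to (5.5), we have

$$\sum_{n=1}^{\infty}n^{2\alpha-1}E_n(g)^2_{L_2(I,w)} \le c(\alpha)\|g\|^2_{W^\alpha(L_2(I,w))},\quad \alpha > 0. \tag{6.6}$$

There is also a Bernstein type inequality for polynomials in $\mathcal{P}_n(I)$ with respect to $L_2(I,w)$ which follows trivially from the definition: for every $p\in\mathcal{P}_n(I)$ and $\alpha > 0$,

$$\|p\|_{W^\alpha(L_2(I,w))} \le (n+1)^\alpha\|p\|_{L_2(I,w)}. \tag{6.7}$$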
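
Because the norm (6.4) is defined directly on the coefficient sequence, the Jackson and Bernstein inequalities can be checked on any finite sequence of coefficients. The sketch below is illustrative only: it assumes NumPy, the names `sobolev_norm` and `best_approx_error` are hypothetical, and the coefficient vector is a stand-in for $\hat g(0),\hat g(1),\dots$ rather than the expansion of a specific function. For $\alpha > 0$ the Jackson inequality holds here with the explicit factor $(n+2)^{-\alpha}$, since $(m+1)^\alpha\ge(n+2)^\alpha$ for $m > n$.

```python
import numpy as np

def sobolev_norm(coeffs, alpha):
    """Norm (6.4): sqrt(sum_m [(m + 1)^alpha |g_hat(m)|]^2)."""
    m = np.arange(len(coeffs))
    return np.sqrt(np.sum(((m + 1.0) ** alpha * np.abs(coeffs)) ** 2))

def best_approx_error(coeffs, n):
    """E_n(g) from (6.3): l2 norm of the coefficients with index m > n."""
    return np.sqrt(np.sum(np.abs(coeffs[n + 1:]) ** 2))

rng = np.random.default_rng(0)
g_hat = rng.standard_normal(40) / (1.0 + np.arange(40)) ** 2   # decaying coefficients
alpha, n = 1.5, 5

# Jackson-type bound (6.5) with explicit constant.
print(best_approx_error(g_hat, n), (n + 2) ** (-alpha) * sobolev_norm(g_hat, alpha))

# Bernstein inequality (6.7) for the polynomial p_n of degree <= n.
p_hat = g_hat.copy()
p_hat[n + 1:] = 0.0
print(sobolev_norm(p_hat, alpha), (n + 1) ** alpha * sobolev_norm(p_hat, 0.0))
```

In each printed pair the first number should not exceed the second.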

It is well known (see [DL, Chapter 7]) that companion inequalities like (6.5) and (6.7) imply a characterization of approximation spaces by interpolation spaces. In our context, the approximation spaces are the Sobolev spaces $W^\alpha(L_2(I,w))$ defined by (6.4) and we therefore obtain for each $0 < \alpha < k$,

$$W^\alpha(L_2(I,w)) = \left(L_2(I,w),\,W^k(L_2(I,w))\right)_{\theta,2},\quad \theta = \alpha/k. \tag{6.8}$$

Further properties of the spaces $W^\alpha(L_2(I,w))$ are given in §7.

7  Approximation  by  ridge  functions 

In this section, we assume that $X_n$ is a subspace of $L_2(I,w)$, $w = w_{d/2}$, of dimension $n$ with the following property. There is a real number $s > 0$ such that, for each univariate function $g\in W^s(L_2(I,w))$, there is a function $r\in X_n$ which provides the Jackson estimate

$$\|g-r\|_{L_2(I,w)} \le c_0 n^{-s}\|g\|_{W^s(L_2(I,w))}, \tag{7.1}$$

with $c_0$ a constant independent of $g$ and $n$.

We define $Y_n$ to be the space of functions $R$ in $d$ variables of the form

$$R(x) = \sum_{\omega\in\Omega_n}r_\omega(x\cdot\omega),\quad r_\omega\in X_n,\ \omega\in\Omega_n, \tag{7.2}$$

where $\Omega_n$ is the set of vectors in $S^{d-1}$ from (4.19). Then, $Y_n$ is a linear space of dimension $\le n\,\#\Omega_n\le cn^d$. We prove the following theorem about approximation from $Y_n$.

Theorem 7.1 Let $X_n$, $n = 1,2,\dots$, satisfy inequality (7.1) for some $s > 0$. If $f$ is a function from the space $W^{s+(d-1)/2}(L_2(B^d))$, then there is a function $R$ in $Y_n$ such that

$$\|f-R\|_{L_2(B^d)} \le cn^{-s-(d-1)/2}\|f\|_{W^{s+(d-1)/2}(L_2(B^d))} \tag{7.3}$$

with $c$ a constant depending only on $s$ and $d$.

Remark 7.1 If $s+(d-1)/2-1$ is an integer and the space $Y_n$ contains $\mathcal{P}_{s+(d-1)/2-1}$, then $\|f\|_{W^{s+(d-1)/2}(L_2(B^d))}$ can be replaced by the semi-norm $|f|_{W^{s+(d-1)/2}(L_2(B^d))}$.

An important element of the proof of Theorem 7.1 is the idea of getting rid of the "low frequencies" when approximating. To this end we shall use the following geometric construction which was proven for us by Boris Kashin.


Lemma 7.1 Let $H$ be a Hilbert space with norm $\|\cdot\|$ and let $A, B\subset H$ be finite dimensional linear subspaces of $H$ with $\dim A\le\dim B$. If there exists $\delta$, $0 < \delta < 1/2$, such that

$$\sup_{x\in A,\ \|x\|=1}\ \inf_{y\in B}\|x-y\| \le \delta, \tag{7.4}$$

then there is a constant $c$ depending only on $\delta$ and a linear operator $L : A\to B$ such that for every $x\in A$

$$\|Lx-x\| \le c\inf_{y\in B}\|x-y\|, \tag{7.5}$$

and

$$Lx-x\perp A\quad (Lx-x \text{ is orthogonal to } A).$$

Proof. See [DOP], Lemma 6. □

Proof of Theorem 7.1. Estimate (7.3) trivially holds if $n < m_0$, where $m_0 = m_0(d)$ is the constant from Proposition 4.1.

Suppose that $n\ge m_0$. Let $P = P_n$ be the polynomial in $\mathcal{P}_n$ given by (5.1) (or (5.2)). Since $P$ is the best $L_2(B^d)$ approximation to $f$, it satisfies (see (5.4))

$$\|f-P\|_{L_2(B^d)} \le cn^{-s-(d-1)/2}\|f\|_{W^{s+(d-1)/2}(L_2(B^d))} \tag{7.6}$$

with $c$ and all subsequent constants in this proof depending only on $s$ and $d$. We shall approximate $P$ by an element $R$ of $Y_N$, $N = k_0 n$, where $k_0$ is a sufficiently large constant depending only on $s$ and $d$.

We have $A_m(P,\xi) = A_m(f,\xi)$, $m\le n$, and $A_m(P,\xi) = 0$, $m > n$. Since $f\in W^{s+(d-1)/2}(L_2(B^d))$, we know from (5.6) that

$$\sum_{m=0}^{n}(m+1)^{2s+2(d-1)}\|A_m(f)\|^2_{L_2(S^{d-1})} \le c\|f\|^2_{W^{s+(d-1)/2}(L_2(B^d))}. \tag{7.7}$$

From this, using (4.23), we obtain

$$\sum_{m=0}^{n}(m+1)^{2s+2(d-1)}\sum_{\omega\in\Omega_n}\lambda_\omega|A_m(f,\omega)|^2 \le c\|f\|^2_{W^{s+(d-1)/2}(L_2(B^d))}. \tag{7.8}$$

We introduce the univariate polynomials

$$p_\omega(t) := \sum_{m=0}^{n}\nu_m A_m(f,\omega)U_m(t) = \sum_{m=0}^{n}\nu_m A_m(P,\omega)U_m(t),\quad \omega\in\Omega_n. \tag{7.9}$$

We have, by (4.22) and (7.9),

$$P(x) = \sum_{m=0}^{n}\nu_m\sum_{\omega\in\Omega_n}\lambda_\omega A_m(P,\omega)U_m(x\cdot\omega) = \sum_{\omega\in\Omega_n}\lambda_\omega p_\omega(x\cdot\omega). \tag{7.10}$$

According to (6.4), we have

$$\|\lambda_\omega^{1/2}p_\omega\|^2_{W^s(L_2(I,w))} = \sum_{m=0}^{n}(m+1)^{2s}\nu_m^2\lambda_\omega|A_m(f,\omega)|^2 \asymp \sum_{m=0}^{n}(m+1)^{2s+2(d-1)}\lambda_\omega|A_m(f,\omega)|^2.$$

Hence, from (7.8),

$$\sum_{\omega\in\Omega_n}\|\lambda_\omega^{1/2}p_\omega\|^2_{W^s(L_2(I,w))} \le c\sum_{m=0}^{n}(m+1)^{2s+2(d-1)}\sum_{\omega\in\Omega_n}\lambda_\omega|A_m(f,\omega)|^2 \le c\|f\|^2_{W^{s+(d-1)/2}(L_2(B^d))}. \tag{7.11}$$

We shall approximate each polynomial $p_\omega$ by elements of $X_N$. We apply Lemma 7.1 in the following setting. We take for $H$ the Hilbert space $L_2(I,w)$ and take $A = \mathcal{P}_n(I)$ and $B = X_N$ with $N\ge k_0 n$ and $k_0$ a positive integer. We next show that if $k_0$ is large enough then the assumption (7.4) is satisfied. We mentioned earlier in (6.7) that $\mathcal{P}_n(I)$ satisfies the Bernstein inequality

$$\|p\|_{W^s(L_2(I,w))} \le (n+1)^s\|p\|_{L_2(I,w)},\quad p\in\mathcal{P}_n(I).$$

If $p\in\mathcal{P}_n(I)$ then, from this Bernstein inequality and from (7.1), there is an $r\in X_N$ such that

$$\|p-r\|_{L_2(I,w)} \le c_0 N^{-s}\|p\|_{W^s(L_2(I,w))} \le c_0 N^{-s}2^s n^s\|p\|_{L_2(I,w)} \le c_0 2^s k_0^{-s}\|p\|_{L_2(I,w)}.$$

Thus, if $k_0$ is large enough, condition (7.4) is satisfied. Therefore, for each $\omega\in\Omega_n$, we can find $r_\omega\in X_N$ such that $r_\omega-p_\omega\perp\mathcal{P}_n(I)$ with respect to the inner product in $L_2(I,w)$ and, by (7.1) and (7.5),

$$\|p_\omega-r_\omega\|_{L_2(I,w)} \le cn^{-s}\|p_\omega\|_{W^s(L_2(I,w))}.$$

Therefore

$$r_\omega-p_\omega = \sum_{m=n+1}^{\infty}\hat r_\omega(m)U_m,\quad \hat r_\omega(m) := \int_I r_\omega(s)U_m(s)w(s)\,ds, \tag{7.12}$$

and

$$\sum_{m=n+1}^{\infty}|\hat r_\omega(m)|^2 = \|p_\omega-r_\omega\|^2_{L_2(I,w)} \le cn^{-2s}\|p_\omega\|^2_{W^s(L_2(I,w))}. \tag{7.13}$$


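We define

\[
R(x) := \sum_{\omega\in\Omega_n} \lambda_\omega\, r_\omega(x\cdot\omega),
\]

which is an element of $Y_N$. Then we have, by (7.10) and (7.12),

\[
R(x) - P(x) = \sum_{\omega\in\Omega_n} \lambda_\omega \sum_{m=n+1}^{\infty} \hat r_\omega(m)\,U_m(\omega\cdot x)
= \sum_{m=n+1}^{\infty} \sum_{\omega\in\Omega_n} \lambda_\omega\, \hat r_\omega(m)\,U_m(\omega\cdot x).
\]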


We write

\[
R_m(x) := \sum_{\omega\in\Omega_n} \lambda_\omega\, \hat r_\omega(m)\,U_m(x\cdot\omega).
\]


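We have by Theorem 3.1 (see also (3.19)-(3.21))

\[
R_m(x) = \nu_m \int_{S^{d-1}} A_m(R_m,\xi)\,U_m(\xi\cdot x)\,d\xi,
\]

where

\[
A_m(R_m,\xi) = \int_{B^d} R_m(y)\,U_m(y\cdot\xi)\,dy
= \sum_{\omega\in\Omega_n} \lambda_\omega\, \hat r_\omega(m) \int_{B^d} U_m(\omega\cdot y)\,U_m(\xi\cdot y)\,dy
= \sum_{\omega\in\Omega_n} \lambda_\omega\, \hat r_\omega(m)\, \frac{U_m(\omega\cdot\xi)}{U_m(1)} .
\]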

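We now use Theorem 3.1 and Lemma 4.4 to obtain

\[
\|R_m\|^2_{L_2(B^d)} = \nu_m \|A_m(R_m,\cdot)\|^2_{L_2(S^{d-1})}
= \nu_m \Bigl\| \sum_{\omega\in\Omega_n} \lambda_\omega\, \hat r_\omega(m)\, \frac{1}{U_m(1)}\, U_m(\xi\cdot\omega) \Bigr\|^2_{L_2(S^{d-1})}
\]
\[
\le c\,\nu_m^{-1} (m/n)^{d-1} \sum_{\omega\in\Omega_n} \lambda_\omega |\hat r_\omega(m)|^2
\le c\,n^{-d+1} \sum_{\omega\in\Omega_n} \lambda_\omega |\hat r_\omega(m)|^2 ,
\]

where we used that $\nu_m \asymp m^{d-1}$ (see (3.8)). From this, (7.11), and (7.13), we find, using the Parseval identity (3.9),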


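\[
\|R - P\|^2_{L_2(B^d)} = \sum_{m=n+1}^{\infty} \|R_m\|^2_{L_2(B^d)}
\le c\,n^{-d+1} \sum_{m=n+1}^{\infty} \sum_{\omega\in\Omega_n} \lambda_\omega |\hat r_\omega(m)|^2
= c\,n^{-d+1} \sum_{\omega\in\Omega_n} \lambda_\omega \sum_{m=n+1}^{\infty} |\hat r_\omega(m)|^2
\]
\[
\le c\,n^{-2s-d+1} \sum_{\omega\in\Omega_n} \|\lambda_\omega^{1/2} p_\omega\|^2_{W^s(L_2(I,w))}
\le c\,n^{-2s-d+1} \|f\|^2_{W^{s+(d-1)/2}(L_2(B^d))} .
\]

Thus there is a function $R \in Y_N$, $N = k_0 n$, such that (7.3) holds. Theorem 7.1 is now proved. □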

Remark 7.2 As in [DOP], it is possible to prove Theorem 7.1 without using Lemma 7.1. In place of this lemma one uses a slightly stronger assumption than estimate (7.1). The corresponding proof would be more constructive than the present one. We do not provide the details of this approach but instead refer the reader to [DOP].


8  Elimination  of  the  weight  w 

The result of §7 (Theorem 7.1) gives sufficient conditions on a sequence of univariate spaces $X_n$, $n = 1,2,\ldots$, in order that the spaces $Y_n$ defined by (7.2) with $\Omega_n$ from (4.19) provide approximation rates for functions in Sobolev spaces $W^s(L_2(B^d))$ comparable to polynomials and splines. However, the assumption (7.1) imposed on $X_n$ is inconvenient for direct application because of the appearance of the weight $w(t) := w_{d/2}(t) := (1-t^2)^{(d-1)/2}$. We shall show in this section how the weight factor $w$ can be avoided so that the result of §7 applies more directly. We shall consider approximation on the ball $B^d_{1/2} := \{x \in \mathbb{R}^d : |x| \le 1/2\}$ rather than $B^d$. Approximation on $B^d$ or other balls follows by a change of variables.

We begin by assuming that we have in hand $n$-dimensional linear spaces $Z_n$ of univariate functions defined on $J := [-1/2, 1/2]$ which satisfy a Jackson type estimate similar to (7.1) but with weight $\equiv 1$. Let $W^m(L_2(J))$, $m = 1,2,\ldots$, be the Sobolev space of functions $g \in L_2(J)$ such that $g^{(m)}$ is in $L_2(J)$. The semi-norm and norm for $W^m(L_2(J))$ are defined by

\[
|g|_{W^m(L_2(J))} := \|g^{(m)}\|_{L_2(J)}; \qquad \|g\|_{W^m(L_2(J))} := \|g^{(m)}\|_{L_2(J)} + \|g\|_{L_2(J)} .
\]

For $0 < s < m$ not an integer, we define $W^s(L_2(J))$ by interpolation:

\[
W^s(L_2(J)) := (L_2(J), W^m(L_2(J)))_{\theta,2}, \qquad \theta := s/m, \tag{8.1}
\]

with the norm the interpolation space norm. For a given value of $s$, different values of $m > s$ give equivalent norms (see [DL]).

Our assumption on $Z_n$ is that for a certain fixed value of $s$, we have that for each $g \in W^s(L_2(J))$, there is a function $\zeta_n \in Z_n$ such that

\[
\|g - \zeta_n\|_{L_2(J)} \le c(s)\,n^{-s}\|g\|_{W^s(L_2(J))} \tag{8.2}
\]

with the constant $c(s)$ depending only on $s$.

Let $X_n$ be the space of univariate functions $r$ such that for some $p \in \mathcal{P}_n(I)$ and some $\zeta \in Z_n$

\[
r(t) = \begin{cases} p(t), & t \in I \setminus J, \\ \zeta(t), & t \in J. \end{cases} \tag{8.3}
\]
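The gluing in (8.3) can be illustrated by a minimal sketch. The helper name `glue` and the particular choices of `p` and `zeta` below are hypothetical, chosen only to show how an element of $X_n$ is assembled from a polynomial on $I \setminus J$ and an element of $Z_n$ on $J$.

```python
from typing import Callable

def glue(p: Callable[[float], float],
         zeta: Callable[[float], float]) -> Callable[[float], float]:
    """Return r with r = zeta on J = [-1/2, 1/2] and r = p on I \\ J, I = [-1, 1]."""
    def r(t: float) -> float:
        if not -1.0 <= t <= 1.0:
            raise ValueError("t must lie in I = [-1, 1]")
        return zeta(t) if abs(t) <= 0.5 else p(t)
    return r

# Hypothetical ingredients, for illustration only.
p = lambda t: 1.0 + t * t                 # stands in for a polynomial in P_n(I)
zeta = lambda t: 0.0 if t < 0 else 1.0    # stands in for an element of Z_n (a step)
r = glue(p, zeta)
```

Note that $r$ need not be continuous at $t = \pm 1/2$; only approximation in $L_2(I,w)$ is required of it.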


We shall show that under the assumption (8.2) on the $Z_n$, the spaces $X_n$, $n = 1,2,\ldots$, satisfy the assumption (7.1). To prove this, we recall the definition (6.4) of the spaces $W^s(L_2(I,w))$ and the operator $\Lambda$ of (2.12):

\[
\Lambda g := \Bigl(\frac{d}{dt}\Bigr)^{d-1} [w g]. \tag{8.4}
\]


According to (2.15), we have $\Lambda^2 U_n = (-1)^{d-1}\mu_n^2 U_n$. Since $\mu_n \asymp n^{d-1}$ (see (2.16)), it follows that for each $g \in W^{m\lambda}(L_2(I,w))$, $\lambda = 2(d-1)$, $m = 1,2,\ldots$, we have

\[
\|g\|_{W^{m\lambda}(L_2(I,w))} \asymp \|\Lambda^{2m} g\|_{L_2(I,w)} \tag{8.5}
\]

with the constants of equivalency depending only on $d$.
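The eigenfunction relation behind (8.5) can be checked exactly in a special case. The sketch below takes $d = 3$, so that $w(t) = 1 - t^2$, and reads (8.4) as $\Lambda g = (wg)''$; it uses the standard identity $C_n^{3/2} = P_{n+1}'$ (Legendre derivative), with $U_n$ proportional to $C_n^{3/2}$. Under these assumptions $\Lambda U_n = -(n+1)(n+2)U_n$, hence $\Lambda^2 U_n = \mu_n^2 U_n$ with $\mu_n = (n+1)(n+2) \asymp n^2$. All function names are illustrative.

```python
from fractions import Fraction

def deriv(c):
    """Derivative of a polynomial given by ascending coefficients."""
    return [k * c[k] for k in range(1, len(c))] or [Fraction(0)]

def mul(a, b):
    """Product of two polynomials (ascending coefficients)."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def trim(c):
    """Drop trailing zero coefficients."""
    c = list(c)
    while len(c) > 1 and c[-1] == 0:
        c.pop()
    return c

def legendre(n):
    """Ascending coefficients of the Legendre polynomial P_n (Bonnet recurrence)."""
    p0, p1 = [Fraction(1)], [Fraction(0), Fraction(1)]
    if n == 0:
        return p0
    for k in range(1, n):
        # (k+1) P_{k+1} = (2k+1) t P_k - k P_{k-1}
        nxt = [Fraction(2 * k + 1) * c for c in [Fraction(0)] + p1]
        for i, c in enumerate(p0):
            nxt[i] -= k * c
        p0, p1 = p1, [c / (k + 1) for c in nxt]
    return p1

def Lambda(g):
    """Lambda g = (w g)'' for d = 3, where w(t) = 1 - t^2."""
    w = [Fraction(1), Fraction(0), Fraction(-1)]
    return trim(deriv(deriv(mul(w, g))))

def check(n):
    """True if Lambda U_n = -(n+1)(n+2) U_n for U_n proportional to P_{n+1}'."""
    u = trim(deriv(legendre(n + 1)))
    mu = (n + 1) * (n + 2)
    return Lambda(u) == [-mu * c for c in u]
```

Exact rational arithmetic is used so that the identity is verified without rounding error.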

Lemma 8.1 For each $m = k\lambda$, with $\lambda := 2(d-1)$ and $k$ a nonnegative integer, we have

\[
\|g\|_{W^m(L_2(J))} \le c(d,m)\,\|g\|_{W^m(L_2(I,w))} \quad \text{for } g \in W^m(L_2(I,w)) \tag{8.6}
\]

with the constant $c(d,m)$ depending only on $d$ and $m$.

Proof. We first observe that the weight $w$ is strictly positive on $J$ and, therefore, $w^{-1}$ is infinitely differentiable on $J$. Then the following identity holds

\[
g^{(\ell(d-1))} = \sum_{j=0}^{\ell(d-1)-1} u_j\, g^{(j)} + u\,\Lambda^\ell g, \qquad \ell = 1,2,\ldots, \tag{8.7}
\]

where $u$ and the $u_j$ are obtained from $w^{-1}$ and its derivatives. Indeed, (8.7) can be proved by induction on $\ell$. For $\ell = 1$, (8.7) follows from Leibniz' formula for differentiating the product $g = w^{-1}(wg)$. Suppose that (8.7) holds for some $\ell \ge 1$. Then one writes $\Lambda^\ell g$ as $w^{-1}[w\Lambda^\ell g]$ and differentiates both sides of (8.7) $d-1$ times to prove it for $\ell + 1$. It follows from (8.7), with $\ell = 2k$ and $m = k\lambda$, that

\[
\|g^{(m)}\|_{L_2(J)} \le c \sum_{j=0}^{m-1} \|g^{(j)}\|_{L_2(J)} + c\,\|\Lambda^{2k} g\|_{L_2(J)}. \tag{8.8}
\]

We shall use next the following well-known inequality (see e.g. [BS])

\[
\|g^{(j)}\|_{L_2(J)} \le c\bigl(\delta^{-j}\|g\|_{L_2(J)} + \delta^{m-j}\|g^{(m)}\|_{L_2(J)}\bigr), \qquad j = 1,2,\ldots,m, \tag{8.9}
\]

where $\delta > 0$ is arbitrary and $c$ depends only on $m$. Combining (8.8) with (8.9) we get, for $0 < \delta < 1$,

\[
\|g^{(m)}\|_{L_2(J)} \le c^*\delta^{-m+1}\|g\|_{L_2(J)} + c^*\delta\,\|g^{(m)}\|_{L_2(J)} + c\,\|\Lambda^{2k} g\|_{L_2(J)}, \tag{8.10}
\]

where $c^* \ge 1$ is independent of $\delta$. We now select $\delta$ such that $c^*\delta = 1/2$ and bring the second term on the right in (8.10) to the left-hand side. We obtain

\[
\|g^{(m)}\|_{L_2(J)} \le c\bigl(\|g\|_{L_2(J)} + \|\Lambda^{2k} g\|_{L_2(J)}\bigr)
\le c\bigl(\|wg\|_{L_2(I)} + \|w\Lambda^{2k} g\|_{L_2(I)}\bigr)
\le c\,\|g\|_{W^m(L_2(I,w))} .
\]


□ 
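The absorption step in the proof of Lemma 8.1 can be written out explicitly. The display below is a sketch under the reading of (8.10) in which the first term carries the factor $c^*\delta^{-m+1}$; the grouping of constants is illustrative rather than the author's exact bookkeeping. With $c^*\delta = 1/2$:

```latex
\[
\|g^{(m)}\|_{L_2(J)} \le c^*\delta^{-m+1}\|g\|_{L_2(J)} + \tfrac12\|g^{(m)}\|_{L_2(J)} + c\,\|\Lambda^{2k}g\|_{L_2(J)}
\;\Longrightarrow\;
\|g^{(m)}\|_{L_2(J)} \le 2c^*\delta^{-m+1}\|g\|_{L_2(J)} + 2c\,\|\Lambda^{2k}g\|_{L_2(J)} .
\]
```

Since $\delta = 1/(2c^*)$ is now fixed, the resulting constant depends only on $d$ and $m$.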

Theorem 8.1 If the sequence of spaces $Z_n$, $n = 1,2,\ldots$, satisfies (8.2), then the spaces $X_n$, $n = 1,2,\ldots$, defined by (8.3) satisfy the Jackson estimates (7.1), i.e. for each univariate function $g \in W^s(L_2(I,w))$, there is a function $r \in X_n$ which provides the Jackson estimate

\[
\|g - r\|_{L_2(I,w)} \le c\,n^{-s}\|g\|_{W^s(L_2(I,w))}, \tag{8.11}
\]

with $c$ a constant independent of $g$ and $n$.

Proof. Consider the linear operator $T$ that associates with every function $g \in L_2(I,w)$ the restriction of $g$ to $J$. Since $w$ is strictly positive on $J$, $T$ is a bounded operator from $L_2(I,w)$ into $L_2(J)$. By Lemma 8.1, $T$ is bounded from $W^m(L_2(I,w))$ into $W^m(L_2(J))$ for each $m = 2k(d-1)$, $k = 1,2,\ldots$. This implies that, for each $0 < s < 2k(d-1)$ and $\theta := s/(2k(d-1))$, we have by interpolation (see (6.8) and (8.1)) that for each $g \in W^s(L_2(I,w))$,

\[
\|g\|_{W^s(L_2(J))} \asymp \|g\|_{(L_2(J),\,W^{2k(d-1)}(L_2(J)))_{\theta,2}}
\le c\,\|g\|_{(L_2(I,w),\,W^{2k(d-1)}(L_2(I,w)))_{\theta,2}} \asymp \|g\|_{W^s(L_2(I,w))} .
\]

Now, given $g \in W^s(L_2(I,w))$, we let $\zeta \in Z_n$ satisfy (8.2). Then, from (8.6),

\[
\|g - \zeta\|_{L_2(J)} \le c\,n^{-s}\|g\|_{W^s(L_2(J))} \le c\,n^{-s}\|g\|_{W^s(L_2(I,w))} .
\]

Similarly, let $p$ be the best approximation in $L_2(I,w)$ to $g$ from $\mathcal{P}_n(I)$. Then, from (6.5),

\[
\|g - p\|_{L_2(I,w)} \le n^{-s}\|g\|_{W^s(L_2(I,w))} .
\]

It follows that the function $r \in X_n$ defined by (8.3) for these $\zeta$ and $p$ satisfies (8.11). □
Theorem 8.2 If the sequence of spaces $Z_n$, $n = 1,2,\ldots$, satisfies (8.2), then for any function $f \in W^{s+(d-1)/2}(L_2(B^d_{1/2}))$, there are functions $r_\omega \in Z_n$ such that

\[
R(x) = \sum_{\omega\in\Omega_n} r_\omega(\omega\cdot x) \tag{8.12}
\]

satisfies

\[
\|f - R\|_{L_2(B^d_{1/2})} \le c\,n^{-s-(d-1)/2}\,\|f\|_{W^{s+(d-1)/2}(L_2(B^d_{1/2}))} \tag{8.13}
\]

with $c$ independent of $f$ and $n$.


Proof. We first recall (see e.g. [A, Chapter IV]) that $f$ can be extended to a function $f_0$ defined on all of $\mathbb{R}^d$ such that $f_0$ vanishes outside of $B^d$ and

\[
\|f_0\|_{W^{s+(d-1)/2}(L_2(B^d))} \le c\,\|f\|_{W^{s+(d-1)/2}(L_2(B^d_{1/2}))}
\]

with a constant $c$ depending only on $s$ and $d$.

We define $X_n$ as in (8.3). From Theorem 8.1, we obtain that condition (7.1) is satisfied. Therefore, from Theorem 7.1 there are functions $r_\omega \in X_n$, $\omega \in \Omega_n$, such that the function

\[
R(x) = \sum_{\omega\in\Omega_n} r_\omega(\omega\cdot x)
\]

satisfies

\[
\|f_0 - R\|_{L_2(B^d)} \le c\,n^{-s-(d-1)/2}\,\|f_0\|_{W^{s+(d-1)/2}(L_2(B^d))}
\le c\,n^{-s-(d-1)/2}\,\|f\|_{W^{s+(d-1)/2}(L_2(B^d_{1/2}))}. \tag{8.14}
\]

On the ball $B^d_{1/2}$, $f_0 = f$ and $r_\omega$ is in $Z_n$ for each $\omega \in \Omega_n$. Therefore, (8.13) follows from (8.14). □

9  Examples  and  further  remarks 

In this section, we shall give some applications of the results of §8. Theorem 8.2 implies that for any sequence of spaces $Z_n$, $n = 1,2,\ldots$, contained in $L_2(J)$, $J = [-1/2, 1/2]$, that satisfy (8.2) we have the estimate (8.13) for $f \in W^{s+(d-1)/2}(L_2(B^d_{1/2}))$. The condition (8.2) is satisfied by all the standard spaces of approximation such as algebraic polynomials and spline functions (discussed in more detail later in this section). We wish to single out, for further elaboration, one particular example which appears frequently in wavelet theory, as well as in computer aided design.

Let $\phi$ be a univariate function with compact support on $\mathbb{R}$. Let $\ell$ be the smallest integer such that $\phi$ or one of its shifts $\phi(x - k)$, $k \in \mathbb{Z}$, is supported on $[0,\ell]$. If necessary, we can redefine $\phi$ to be one of its integer shifts and thereby require that $\phi$ is supported on $[0,\ell]$. We denote by $\mathcal{S} := \mathcal{S}(\phi)$ the shift-invariant space which is the $L_2(\mathbb{R})$-closure of finite linear combinations of the shifts $\phi(\cdot - j)$, $j \in \mathbb{Z}$, of $\phi$. By dilation, we obtain the univariate spaces

\[
\mathcal{S}_k := \{ S(2^k\,\cdot) : S \in \mathcal{S} \}, \qquad k \in \mathbb{Z}.
\]

The approximation properties of the family of spaces $\mathcal{S}_k$ are well understood. In [BDR], there is a complete characterization (in terms of the Fourier transform $\hat\phi$ of $\phi$) of when the spaces $\mathcal{S}_k$ provide the Jackson estimates

\[
\operatorname{dist}(g, \mathcal{S}_k)_{L_2(\mathbb{R})} \le C\,2^{-ks}\,\|g\|_{W^s(L_2(\mathbb{R}))}. \tag{9.1}
\]

For an integer $s$, we say that $\phi$ satisfies the Strang-Fix conditions of order $s$ if

\[
\hat\phi(0) \ne 0, \quad \text{and} \quad D^j\hat\phi(2k\pi) = 0, \quad k \in \mathbb{Z},\ k \ne 0,\ j = 0,1,\ldots,s-1. \tag{9.2}
\]

If $\phi$ satisfies (9.2) and $\phi$ is piecewise continuous and of bounded variation then $\mathcal{S}_k$ provides the approximation estimate (9.1) (see e.g. [DL, Chapter 13]).
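For a concrete feel for (9.2), one can use the commonly stated equivalent form for compactly supported $\phi$: the integer shifts of $\phi$ reproduce polynomials of degree less than $s$. The sketch below checks this for the linear B-spline (hat function) supported on $[0,2]$, a hypothetical choice of $\phi$ with $s = 2$. It verifies $\sum_j \phi(t-j) = 1$ and $\sum_j (j+1)\phi(t-j) = t$ numerically; it does not evaluate the Fourier-side conditions themselves.

```python
import math

def hat(t):
    """Linear B-spline supported on [0, 2] (illustrative choice of phi)."""
    if 0.0 <= t <= 1.0:
        return t
    if 1.0 < t <= 2.0:
        return 2.0 - t
    return 0.0

def shift_sum(t, weight=lambda j: 1.0):
    """Sum of weight(j) * hat(t - j) over all integer shifts that can be nonzero at t."""
    j_lo = math.floor(t) - 2
    return sum(weight(j) * hat(t - j) for j in range(j_lo, j_lo + 5))

# Reproduction of constants and of the linear function t.
points = [k / 7.0 for k in range(-21, 22)]
const_err = max(abs(shift_sum(t) - 1.0) for t in points)
linear_err = max(abs(shift_sum(t, lambda j: j + 1.0) - t) for t in points)
```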

We denote by $\mathcal{S}_k(J)$, $k \ge 1$, the restrictions of the spaces $\mathcal{S}_k$ to the interval $J := [-1/2, 1/2]$. The functions $\phi(2^k t - j)$, $j = -\ell+1-2^{k-1}, \ldots, 2^{k-1}-1$, span $\mathcal{S}_k(J)$. Each function $g$ in $W^s(L_2(J))$ can be extended to $\mathbb{R}$ with

\[
\|g\|_{W^s(L_2(\mathbb{R}))} \le c\,\|g\|_{W^s(L_2(J))} .
\]

It follows therefore that the spaces $\mathcal{S}_k(J)$ provide the approximation property (8.2) and hence Theorem 8.2 applies with $n = 2^k$. The functions $R$ appearing in Theorem 8.2 are of the form

\[
R(x) = \sum_{j=-\ell+1-2^{k-1}}^{2^{k-1}-1} \ \sum_{\omega\in\Omega_{2^k}} c(j,\omega)\,\phi(2^k x\cdot\omega - j).
\]

There is another representation of the functions in $\mathcal{S}_k(J)$ related to sigmoidal functions. Let

\[
\sigma(t) := \sum_{j=0}^{\infty} \phi(t - j). \tag{9.3}
\]

Then the functions $\sigma(2^k t - j)$, $j = -\ell+1-2^{k-1}, \ldots, 2^{k-1}-1$, also span $\mathcal{S}_k(J)$. The function $\sigma$ is 0 for $t$ sufficiently large negative and 1 for $t$ sufficiently large positive. However, it is not necessarily monotone (without additional assumptions on $\phi$).
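The passage from $\phi$ to $\sigma$ can be illustrated numerically. The sketch below assumes $\sigma(t) = \sum_{j \ge 0}\phi(t-j)$ and, as a hypothetical example, takes $\phi$ to be the hat function supported on $[0,2]$. It exhibits the two properties used in the text: $\sigma$ is 0 far to the left and 1 far to the right, and $\sigma(t) - \sigma(t-1) = \phi(t)$, which is why shifts of $\sigma$ and shifts of $\phi$ span the same space on a bounded interval.

```python
import math

def hat(t):
    """Linear B-spline supported on [0, 2] (illustrative choice of phi)."""
    if 0.0 <= t <= 1.0:
        return t
    if 1.0 < t <= 2.0:
        return 2.0 - t
    return 0.0

def sigma(t, phi=hat):
    """sigma(t) = sum_{j >= 0} phi(t - j); only 0 <= j <= t can contribute."""
    if t <= 0.0:
        return 0.0
    return sum(phi(t - j) for j in range(0, math.floor(t) + 1))

# Difference identity sigma(t) - sigma(t - 1) = phi(t) on a sample grid.
grid = [k / 8.0 for k in range(-16, 49)]
diff_err = max(abs(sigma(t) - sigma(t - 1.0) - hat(t)) for t in grid)
```

For this particular $\phi$ the resulting $\sigma$ is the ramp that rises linearly from 0 to 1 on $[0,1]$, which is monotone; for general $\phi$ monotonicity can fail, as noted above.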

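Corollary 9.1 Let $\phi$ satisfy the Strang-Fix conditions (9.2) of order $s$. Then for each function $f \in W^{s+(d-1)/2}(L_2(B^d_{1/2}))$, there is a function

\[
R(x) = \sum_{j=-\ell+1-2^{k-1}}^{2^{k-1}-1} \ \sum_{\omega\in\Omega_{2^k}} c(j,\omega)\,\sigma(2^k x\cdot\omega - j)
\]

such that

\[
\|f - R\|_{L_2(B^d_{1/2})} \le c\,2^{-k(s+(d-1)/2)}\,\|f\|_{W^{s+(d-1)/2}(L_2(B^d_{1/2}))}
\]

with $c$ independent of $f$ and $k$.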
For certain choices of $\phi$ above, we obtain that $\sigma$ of (9.3) is a sigmoidal function in the terminology of neural networks. We recall that a sigmoidal function is a nonnegative, monotone, univariate function which has limits 0 as $t \to -\infty$ and 1 as $t \to \infty$. To obtain examples of such sigmoidal functions, we can take $\phi$ to be a B-spline. Let $\phi := N_{0,s}$, where for each $j \in \mathbb{Z}$ and $s = 1,2,\ldots$, $N_{j,s} := s^{-1}M_{j,s}$ is the B-spline of order $s$ (see [DL, Chapter 5]) with breakpoints . . . . The function

\[
\sigma_s(t) := \sum_{j=-s+1}^{\infty} N_{j,s}(t), \qquad t \in \mathbb{R},
\]

of (9.3) is a sigmoidal function, and, in the case $s = 1$, it is the unit impulse function $\chi_{[0,\infty)}$. The functions $\sigma_s(t - \frac{j}{2n})$, $j = -n, \ldots, n+s-1$, form a basis for $S_{n,s}$, the space of all splines of degree $s-1$ defined on $J$ with breakpoints belonging to the set $\{-\frac{n-1}{2n}, -\frac{n-2}{2n}, \ldots\}$. From Theorem 8.2, we obtain the following.

Corollary 9.2 For any $f \in W^{s+(d-1)/2}(L_2(B^d_{1/2}))$, there are constants $c(k,\omega)$, $\omega \in \Omega_n$, $k = -n, \ldots, n+s-1$, such that

\[
R(x) = \sum_{\omega\in\Omega_n} \sum_{k=-n}^{n+s-1} c(k,\omega)\,\sigma_s\Bigl(x\cdot\omega - \frac{k}{2n}\Bigr) \tag{9.4}
\]

satisfies

\[
\|f - R\|_{L_2(B^d_{1/2})} \le c\,n^{-s-(d-1)/2}\,\|f\|_{W^{s+(d-1)/2}(L_2(B^d_{1/2}))}
\]

with $c$ independent of $f$ and $n$.

The functions $R$ in (9.4) correspond to the outputs of a feed-forward neural network with $O(n^{d-1})$ nodes of computation. Thus, the corollary shows that such neural networks have computational efficiency comparable to standard methods of approximation like splines and wavelets.

The special case $s = 1$ in Corollary 9.2 is also noteworthy. In this case the function $\sigma$ is the unit-impulse function and the functions $R$ are piecewise constant. The order of approximation provided by Corollary 9.2 is somewhat surprising. One might expect that such piecewise constants could only provide approximation order 1 while the corollary gives approximation order $(d+1)/2$.
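To make the network interpretation concrete, the following minimal sketch evaluates a function of the form (9.4) in the case $s = 1$, $d = 2$, as a one-hidden-layer network with unit-impulse activation. Several ingredients are assumptions made only for illustration: the directions are taken equally spaced on a half-circle (the actual set $\Omega_n$ comes from (4.19), which is not reproduced here), the thresholds are taken as $k/(2n)$, and the coefficients are arbitrary.

```python
import math

def heaviside(t):
    """Unit-impulse activation: 1 for t >= 0, else 0."""
    return 1.0 if t >= 0.0 else 0.0

def ridge_network(x, directions, thresholds, coeffs):
    """Evaluate sum over omega and k of c(k, omega) * heaviside(x . omega - b_k)."""
    total = 0.0
    for omega, row in zip(directions, coeffs):
        proj = sum(xi * wi for xi, wi in zip(x, omega))   # inner product x . omega
        total += sum(c * heaviside(proj - b) for c, b in zip(row, thresholds))
    return total

# Hypothetical configuration for d = 2 (not the paper's Omega_n).
n = 4
directions = [(math.cos(math.pi * i / n), math.sin(math.pi * i / n)) for i in range(n)]
thresholds = [k / (2 * n) for k in range(-n, n + 1)]       # assumed shifts k/(2n)
coeffs = [[1.0 / n] * len(thresholds) for _ in directions]

value_at_origin = ridge_network((0.0, 0.0), directions, thresholds, coeffs)
```

Each hidden node computes one indicator of a half-plane $\{x : x\cdot\omega \ge b_k\}$, so the output is piecewise constant on the cells cut out by these lines.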


10  Appendix 

A1. Proof of (3.4). Since $\mathcal{P}_n$ is invariant under rotations, it is sufficient to prove that $\langle P(x), U_n(x_1)\rangle = 0$ for each $P \in \mathcal{P}_{n-1}$ or that

\[
\langle x^m, U_n(x_1)\rangle := \int_{B^d} x^m\,U_n(x_1)\,dx = 0 \quad \text{when } |m| \le n-1.
\]

Write

\[
B_{x_1} := \{x' = (x_2, \ldots, x_d) : x_2^2 + \cdots + x_d^2 \le 1 - x_1^2\}.
\]

We have

\[
\langle x^m, U_n(x_1)\rangle = \int_I x_1^{m_1} \Bigl( \int_{B_{x_1}} x_2^{m_2}\cdots x_d^{m_d}\,dx' \Bigr) U_n(x_1)\,dx_1 .
\]

Because of the symmetry it is obvious that the inner integral above is equal to zero if at least one of $m_2, \ldots, m_d$ is odd. Consider the case when all $m_2, \ldots, m_d$ are even. We now change the rectangular coordinates in the inner integral to spherical and find

\[
\int_{B_{x_1}} x_2^{m_2}\cdots x_d^{m_d}\,dx' = c \int_0^{(1-x_1^2)^{1/2}} r^{m_2+\cdots+m_d+d-2}\,dr = c\,(1-x_1^2)^{(m_2+\cdots+m_d+d-1)/2},
\]

where $c$ depends on $d, m_2, \ldots, m_d$. Therefore

\[
\langle x^m, U_n(x_1)\rangle = c \int_I x_1^{m_1}(1-x_1^2)^{(m_2+\cdots+m_d)/2}\,U_n(x_1)\,(1-x_1^2)^{(d-1)/2}\,dx_1 = 0
\]

since the univariate polynomial $U_n$ is orthogonal to $\mathcal{P}_{n-1}(I)$ in $L_2(I,w)$ (see (3.3) and (2.1)). □

A2. Proof of (3.10). We first show that for each $g \in L_1(I, w_{(d-1)/2})$

\[
\mathcal{R}(g(\eta\cdot x); \xi, t) = |B^{d-2}|\,(1-t^2)^{\frac{d-1}{2}} \int_I g(\cos\theta\cos\psi + u\sin\theta\sin\psi)\,(1-u^2)^{\frac{d-2}{2}}\,du, \tag{10.1}
\]

where $t =: \cos\theta$, $t \in I$, $\psi \in [0,\pi]$ is the angle between $\xi$ and $\eta$ ($\cos\psi = \xi\cdot\eta$), $|B^{d-2}|$ is the volume of the unit ball $B^{d-2}$ in $\mathbb{R}^{d-2}$, $|B^{d-2}| = \frac{2\pi^{(d-2)/2}}{(d-2)\Gamma(d/2-1)}$, and $\mathcal{R}$ is the Radon transform defined in (3.23). Indeed, it is easily seen that

\[
\mathcal{R}(g(\eta\cdot x); \xi, t) = |B^{d-2}| \int_{-\sqrt{1-t^2}}^{\sqrt{1-t^2}} g(t\cos\psi + v\sin\psi)\,(1-t^2-v^2)^{\frac{d-2}{2}}\,dv .
\]

Substituting $v = (1-t^2)^{1/2}u$ in the above integral we get (10.1).

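Our second step is to prove that

\[
\mathcal{R}(C_n^{d/2}(\eta\cdot x); \xi, t) = \kappa_{n,d}\,(1-t^2)^{\frac{d-1}{2}}\,C_n^{d/2}(t)\,C_n^{d/2}(\xi\cdot\eta), \tag{10.2}
\]

with a constant $\kappa_{n,d}$ depending only on $n$ and $d$.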

Indeed, the classical addition theorem for Legendre (Gegenbauer) polynomials can be written as follows (see [E], p. 178):

\[
C_n^{\lambda}(\cos\theta\cos\psi + \sin\theta\sin\psi\cos\varphi)
= \sum_{m=0}^{n} \frac{2^{2m}(2\lambda+2m-1)(n-m)!\,[(\lambda)_m]^2}{(2\lambda-1)_{n+m+1}}
(\sin\theta)^m C_{n-m}^{\lambda+m}(\cos\theta)\,(\sin\psi)^m C_{n-m}^{\lambda+m}(\cos\psi)\,C_m^{\lambda-1/2}(\cos\varphi)
\]

and hence, for $\lambda = d/2$, and $u := \cos\varphi$, $u \in I$, we have

\[
C_n^{d/2}(\cos\theta\cos\psi + u\sin\theta\sin\psi)
= \sum_{m=0}^{n} \frac{2^{2m}(d+2m-1)(n-m)!\,[(d/2)_m]^2}{(d-1)_{n+m+1}}
(\sin\theta)^m C_{n-m}^{d/2+m}(\cos\theta)\,(\sin\psi)^m C_{n-m}^{d/2+m}(\cos\psi)\,C_m^{(d-1)/2}(u).
\]

We now insert this into (10.1) and use the fact that $C_m^{(d-1)/2}(u)$, $m = 1,2,\ldots$, are orthogonal to the constants in $L_2(I, w_{(d-1)/2})$ to obtain

\[
\mathcal{R}(C_n^{d/2}(\eta\cdot x); \xi, t) = |B^{d-2}|\,(1-t^2)^{\frac{d-1}{2}}\,\frac{(d-1)\,n!}{(d-1)_{n+1}}\,C_n^{d/2}(\cos\theta)\,C_n^{d/2}(\cos\psi) \int_I (1-u^2)^{\frac{d-2}{2}}\,du .
\]

This implies (10.2).

We finally use (10.2) to obtain

\[
\int_{B^d} C_n^{d/2}(\xi\cdot x)\,C_n^{d/2}(\eta\cdot x)\,dx
= \int_I \mathcal{R}(C_n^{d/2}(\eta\cdot x); \xi, t)\,C_n^{d/2}(t)\,dt
= \gamma_{n,d}\,C_n^{d/2}(\xi\cdot\eta),
\]

where $\gamma_{n,d}$ is the constant factor in (10.2) multiplied by $\int_I [C_n^{d/2}(t)]^2 (1-t^2)^{\frac{d-1}{2}}\,dt$. Simple calculations show that this is (3.10). See [RK]. □

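A3. Proof of (3.16). The following relation between contiguous Gegenbauer polynomials holds (see [E], p. 178, (36)):

\[
(n+\lambda)\,C_{n+1}^{\lambda-1} = (\lambda-1)\bigl[C_{n+1}^{\lambda} - C_{n-1}^{\lambda}\bigr], \qquad \lambda > 1.
\]

Also, $C_0^{\lambda}(t) = 1$ and $C_1^{\lambda}(t) = 2\lambda t$. These identities readily imply

\[
C_n^{\lambda} = \sum_{j=0}^{[n/2]} \frac{n-2j+\lambda-1}{\lambda-1}\,C_{n-2j}^{\lambda-1}. \tag{10.3}
\]

Simple calculations show that (10.3) with $\lambda = d/2$ ($d > 2$) is (3.16). □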
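Identity (10.3) is easy to test numerically. The sketch below evaluates Gegenbauer polynomials by the standard three-term recurrence $k\,C_k^{\lambda} = 2(k+\lambda-1)\,t\,C_{k-1}^{\lambda} - (k+2\lambda-2)\,C_{k-2}^{\lambda}$ and compares both sides of (10.3) for $\lambda = d/2$ with $d = 3, 4, 5$. The function names are illustrative, and the check is a sanity test rather than a proof.

```python
def gegenbauer(n, lam, t):
    """Evaluate C_n^lam(t) by the three-term recurrence (requires lam > 0)."""
    c_prev, c = 1.0, 2.0 * lam * t
    if n == 0:
        return c_prev
    for k in range(2, n + 1):
        c_prev, c = c, (2.0 * (k + lam - 1.0) * t * c
                        - (k + 2.0 * lam - 2.0) * c_prev) / k
    return c

def rhs_10_3(n, lam, t):
    """Right-hand side of (10.3): expansion of C_n^lam in the C_{n-2j}^{lam-1}."""
    return sum((n - 2 * j + lam - 1.0) / (lam - 1.0)
               * gegenbauer(n - 2 * j, lam - 1.0, t)
               for j in range(n // 2 + 1))

# Largest discrepancy over a small grid of degrees, dimensions and points.
max_err = max(abs(gegenbauer(n, d / 2.0, t) - rhs_10_3(n, d / 2.0, t))
              for d in (3, 4, 5)
              for n in range(9)
              for t in (-0.9, -0.3, 0.0, 0.5, 1.0))
```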

A4. Proof of (3.11). Identity (3.11) follows from the fact that $U_n(\xi\cdot x)$ (as a function of $\xi$) is a spherical polynomial in $\mathcal{H}_n \oplus \mathcal{H}_{n-2} \oplus \cdots \oplus \mathcal{H}_{\varepsilon}$ and $\nu_n U_n(\xi\cdot\eta)/U_n(1)$ is the reproducing kernel for this space (see (3.17)). □